Package 'SIMle'

Title: Estimation and Inference for General Time Series Regression
Description: We provide functions for estimation and inference of nonlinear and non-stationary time series regression using sieve methods and a bootstrapping procedure.
Authors: Xiucai Ding [aut, cre, cph], Chen Qian [aut, cph]
Maintainer: Xiucai Ding <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2025-02-02 03:55:53 UTC
Source: https://github.com/cran/SIMle

Help Index


Automated exact form test

Description

This function uses the L2 test to run exact form tests automatically with the chosen bases.

Usage

auto.exact.test(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  ops,
  exact_func,
  m = "MV",
  r = 1,
  s = 1,
  per = 0,
  k = 0,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC" (Akaike Information Criterion), "BIC" (Bayesian Information Criterion), "CV" (cross-validation), and "Kfold" (k-fold cross-validation for time series data).

exact_func

A list of matrices containing the exact functions to be tested. The k-th element corresponds to the k-th variable, and each matrix contains the values of the exact function over its domain.

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

r

The number of variates.

s

A positive scaling factor. The default is 1.

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Details

For the basis type, this package provides 32 types of bases, including 'Legen' for Legendre polynomials, 'Cheby' for Chebyshev polynomials of the first kind, 'tri' for trigonometric polynomials, 'cos' for cosine polynomials, 'sin' for sine polynomials, and 'Cspli' for the class of spline functions. In the 'Cspli' option, the first input 'c' is the number of knots plus 2, where the additional 2 correspond to the endpoints 0 and 1. The term 'or' indicates the order of the splines, so the number of basis elements is (number of knots + 2) - 2 + or, that is, c - 2 + or. When functions choose the number of basis elements for splines automatically, the number is not less than the order of the spline. The package provides 'db1' to 'db20' for the Daubechies1 to Daubechies20 wavelet bases, and 'cf1' to 'cf5' for the Coiflet1 to Coiflet5 wavelet bases. The wavelet tables provided by the Sie2nts package are generated by the Cascade algorithm using a low-pass filter. If exact values of wavelets are required, the Recursion algorithm should be used.
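The spline basis count can be checked with a short sketch. The helper name `cspli_basis_count` is hypothetical and not part of the package; it only encodes the rule c - 2 + or that this manual gives for the 'Cspli' option.

```r
# Hypothetical helper (not part of SIMle): number of spline basis elements
# implied by the rule "c - 2 + or" given in this manual for 'Cspli'.
cspli_basis_count <- function(c, or = 4) {
  stopifnot(c >= 2, or >= 1)
  c - 2 + or
}

cspli_basis_count(10, 4)  # 12
```

With the defaults used elsewhere in this manual (c = 10, or = 4), the rule gives 12 basis elements.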

Value

A list whose elements are the p-values of the exact form test, ordered by variate.

References

[1] Ding, Xiucai, and Zhou, Zhou. “Estimation and inference for precision matrices of nonstationary time series.” The Annals of Statistics 48(4) (2020): 2455-2477.

[2] Ding, Xiucai, and Zhou, Zhou. “Auto-regressive approximations to non-stationary time series, with inference and applications.” Available online, 2021.

[3] Ding, Xiucai, and Zhou, Zhou. “Simultaneous Sieve Inference for Time-Inhomogeneous Nonlinear Time Series Regression.” Available online, 2021.


Automated estimation of nonlinear time series regression

Description

This function estimates nonlinear time series regression by sieve methods with chosen bases.

Usage

auto.fit(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  type,
  ops,
  per = 0,
  k = 0,
  fix_num = 0,
  r = 1,
  s = 1,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "nfix" indicates estimation with no value fixed, "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC", "BIC", "CV", and "Kfold".

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

fix_num

The fixed value used in fixed-value nonlinear time series regression. The default is 0, which is used for non-fixed estimation. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

If "nfix" is selected, the function returns a list in which each element is a matrix representing the estimated function in two dimensions. Otherwise, the function returns a list in which each element is a vector representing the estimated function.


Automated time-homogeneity test

Description

This function utilizes Simultaneous Confidence Regions (SCR) for the automated execution of time-homogeneity tests with chosen bases.

Usage

auto.homo.test(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  ops,
  m = "MV",
  fix_num = 0,
  r = 1,
  s = 1,
  per = 0,
  k = 0,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC", "BIC", "CV", and "Kfold".

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

fix_num

The fixed value for time.

r

The number of variates.

s

A positive scaling factor. The default is 1.

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds, which are used for time-homogeneity testing. The third column is a factor indicating the type of each value in the second column.


Automated creation of a Simultaneous Confidence Region (SCR) for the estimated function

Description

This function generates a Simultaneous Confidence Region (SCR) for the estimated function with chosen bases.

Usage

auto.SCR(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  type,
  ops,
  m = "MV",
  fix_num = 0,
  r = 1,
  s = 1,
  per = 0,
  k = 0,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC", "BIC", "CV", and "Kfold".

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

fix_num

The fixed value used in fixed-value nonlinear time series regression. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds. The third column is a factor indicating the type of each value in the second column.
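The default m = "MV" refers to the Minimum Volatility method for choosing a window size. The general idea is to compute a statistic over a grid of candidate window sizes and pick the size around which the statistic changes least. The base R sketch below applies that rule to a batched-means long-run variance estimate. It is an illustration under assumptions: the statistic, the grid, the neighbourhood half-width, and the names `lrv_batch` and `mv_select` are hypothetical choices made here, and the package's internal implementation may differ.

```r
# Batched-means estimate of the long-run variance with block length m.
lrv_batch <- function(x, m) {
  n_blocks <- floor(length(x) / m)
  blocks <- matrix(x[seq_len(n_blocks * m)], nrow = m)  # one block per column
  m * var(colMeans(blocks))
}

# Minimum volatility rule: pick the m whose neighbouring estimates vary least.
mv_select <- function(x, m_grid, half_width = 2) {
  est <- vapply(m_grid, function(m) lrv_batch(x, m), numeric(1))
  # Only grid points with a full neighbourhood on both sides are candidates.
  idx <- (half_width + 1):(length(m_grid) - half_width)
  vol <- vapply(
    idx,
    function(j) sd(est[(j - half_width):(j + half_width)]),
    numeric(1)
  )
  m_grid[idx[which.min(vol)]]
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 500))  # simulated AR(1) series
m_mv <- mv_select(x, m_grid = 2:20)
m_mv
```

A wider grid or a different half-width changes which window sizes are compared, so the selected value should be read as data-driven guidance rather than a fixed constant.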


Automated separability test

Description

This function utilizes Simultaneous Confidence Regions (SCR) for the automated execution of separability tests with chosen bases.

Usage

auto.sep.test(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  type,
  ops,
  m = "MV",
  fix_num = 0,
  r = 1,
  s = 1,
  per = 0,
  k = 0,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "fixt" indicates a fixed time t, and "fixx" indicates a fixed variate value.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC", "BIC", "CV", and "Kfold".

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

fix_num

The fixed value used in fixed-value nonlinear time series regression. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds, which are used for separability testing. The third column is a factor indicating the type of each value in the second column.


Generate Mapping Basis

Description

This function generates the values of the k-th basis function. (The wavelet basis options return the full table.)

Usage

bs.gene.trans(
  type,
  mp_type,
  k,
  upper = 10,
  s = 1,
  n_esti = 500,
  c = 10,
  or = 4
)

Arguments

type

The type of basis to use.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

k

The index of the basis function, that is, the k-th basis function.

upper

The upper bound of the basis domain. The default is 10.

s

A positive scaling factor. The default is 1.

n_esti

The number of values evaluated from the k-th basis function. The default is 500.

c

Used only for Cspli: the total number of knots to generate. The default is 10. c should not be less than k. (For splines, the true number of basis functions is c - 2 + or.)

or

The order of the spline, used only for the Cspli type. The default is 4, which corresponds to cubic splines.

Value

A matrix in which the k-th column corresponds to the values of the k-th mapped basis function

References

[1] Chen, Xiaohong. “Large Sample Sieve Estimation of Semi-Nonparametric Models.” Handbook of Econometrics, 6(B): 5549–5632, 2007.

Examples

bs.gene.trans("Legen", "algeb", 5)
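To illustrate what a mapped basis is, the sketch below evaluates Legendre polynomials on a grid over [-upper, upper] after mapping the real line into (-1, 1). It is written in base R and does not call the package. The mapping formulas (x / sqrt(x^2 + s^2) for the algebraic map and tanh(x / s) for the logarithmic map) are common choices in the mapped-basis literature and are assumptions here; the package's own formulas and normalization may differ. The names `legendre`, `map_real_line`, and `mapped_legendre` are hypothetical.

```r
# Legendre polynomial P_j on [-1, 1], computed with Bonnet's recurrence.
legendre <- function(j, x) {
  p_prev <- rep(1, length(x))           # P_0
  if (j == 0) return(p_prev)
  p_curr <- x                           # P_1
  if (j == 1) return(p_curr)
  for (n in 1:(j - 1)) {
    p_next <- ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    p_prev <- p_curr
    p_curr <- p_next
  }
  p_curr
}

# Assumed mappings from the real line into (-1, 1); s is a positive scale.
map_real_line <- function(x, mp_type = c("algeb", "logari"), s = 1) {
  mp_type <- match.arg(mp_type)
  if (mp_type == "algeb") x / sqrt(x^2 + s^2) else tanh(x / s)
}

# Matrix whose j-th column holds the mapped Legendre polynomial of degree j - 1.
mapped_legendre <- function(k, mp_type = "algeb", upper = 10, s = 1,
                            n_esti = 500) {
  x <- seq(-upper, upper, length.out = n_esti)
  u <- map_real_line(x, mp_type, s)
  sapply(0:(k - 1), function(j) legendre(j, u))
}

B <- mapped_legendre(5, "algeb")
dim(B)  # 500 5
```

Each column can be plotted against the grid with `matplot()` to compare the two mappings.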

Plots of mapping basis

Description

This function generates a plot of the first k basis functions.

Usage

bs.plot.trans(type, mp_type, k, upper = 10, s = 1, or = 4, title = "")

Arguments

type

The type of basis to use.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

k

The number of basis functions to plot. (If wavelets are chosen, the real number of basis functions is 2^k. If Cspli is chosen, the real number of basis functions is k - 2 + or.)

upper

The upper bound of the basis domain. The default is 10.

s

A positive scaling factor. The default is 1.

or

The order of the spline, used only for the Cspli type. The default is 4, which corresponds to cubic splines.

title

The title of the basis plot.

Value

A plot of the first k basis functions.

Examples

bs.plot.trans("Legen", "algeb", 2)

Visualization of the cross-validation results

Description

Visualization of the cross-validation results

Usage

cv.plot(cv_m, title = "")

Arguments

cv_m

The cross-validation data frame.

title

The title of the plot.

Value

A 3D plot of the cross-validation results.


Cross validation result by specific criteria

Description

This function computes the cross-validation result for a specified criterion.

Usage

cv.res(ts, c, d, b_time, b_timese, mp_type, ops, r = 1, s = 1, per = 0, k = 0)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

ops

The criterion for choosing the number of basis functions. The package offers four options: "AIC", "BIC", "CV", and "Kfold".

r

The number of variates.

s

A positive scaling factor. The default is 1.

per

The percentage of the data used as the test set for the "CV" option.

k

The number of folds used for the "Kfold" option.

Value

A data frame containing the criterion values for each combination of "c" and "d". The first element refers to the optimal number of basis functions for the time input, and the second element refers to the optimal number of basis functions for the variate input.
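The idea of choosing the number of basis functions by an information criterion can be sketched in base R. The example below is a simplified stand-in and not the package's procedure: it regresses a simulated series on polynomial terms of its first lag and picks the polynomial degree with the smallest AIC or BIC. The simulated model and all names (`dat`, `fits`, `crit`, `best_aic`, `best_bic`) are assumptions made for illustration.

```r
set.seed(123)
n <- 300
x <- numeric(n)
for (i in 2:n) x[i] <- 0.6 * sin(x[i - 1]) + rnorm(1, sd = 0.5)

# Response and first lag, aligned in time.
dat <- data.frame(y = x[-1], lag1 = x[-n])

# Fit one model per candidate number of polynomial terms (1 to 5).
degrees <- 1:5
fits <- lapply(degrees, function(d) lm(y ~ poly(lag1, d), data = dat))

# Criterion table: one row per candidate; smaller values are better.
crit <- data.frame(
  d   = degrees,
  AIC = vapply(fits, AIC, numeric(1)),
  BIC = vapply(fits, BIC, numeric(1))
)

best_aic <- crit$d[which.min(crit$AIC)]
best_bic <- crit$d[which.min(crit$BIC)]
crit
```

The same pattern extends to a two-dimensional grid over the time and variate basis counts by storing one criterion value per pair of candidates.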


Exact form test

Description

This function employs the L2 test for the user-specified execution of exact form tests.

Usage

exact.test(
  ts,
  c,
  d,
  m = "MV",
  b_time,
  b_timese,
  mp_type,
  exact_func,
  r = 1,
  s = 1,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The number of basis functions for the time input.

d

The number of basis functions for the variate input.

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

exact_func

A list of matrices containing the exact functions to be tested. The k-th element corresponds to the k-th variable, and each matrix contains the values of the exact function over its domain.

r

The number of variates.

s

A positive scaling factor. The default is 1.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list whose elements are the p-values of the exact form test, ordered by variate.


Visualization of estimation

Description

Visualization of estimation

Usage

fit.plot(
  res_esti,
  ops,
  mp_type,
  title = "",
  lower = -1.3,
  upper = 1.3,
  domain = 10
)

Arguments

res_esti

The result of estimation.

ops

The type of estimation: "nfix" indicates estimation with no value fixed, "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

title

The title of the plot.

lower

The lower bound of the scale limits. The default is -1.3.

upper

The upper bound of the scale limits. The default is 1.3.

domain

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -domain to domain.

Value

A plot of the estimated function.

Examples

# Simulate a nonlinear, time-varying AR(1) series of length n
# with innovation standard deviation 1/v.
generate_nAR1 = function(n, v){
 ts = numeric(n)
 w = rnorm(n, 0, 1/v)
 x_ini = runif(1,0,1)  # random initial value
 for(i in 1:n){
   if(i == 1){
     ts[i] = sin(2*pi*(i/n))*exp(-x_ini^2)  + w[i]
   } else{
     ts[i] = sin(2*pi*(i/n))*exp(-ts[i-1]^2) + w[i]
   }
 }
 return(ts)
}
ts  = generate_nAR1(200, 1) # change sample size in real case
# Estimate at the fixed time 0.1 with c = 5 and d = 2.
res_esti = fix.fit(ts, 5, 2, "Legen", "Legen", "algeb", "fixt", 0.1)
fit.plot(res_esti[[1]], "fixt", "algeb")

User-specified estimation of nonlinear time series regression

Description

This function estimates nonlinear time series regression by sieve methods

Usage

fix.fit(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  type,
  fix_num = 0,
  r = 1,
  s = 1,
  n_esti = 2000,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The number of basis functions for the time input.

d

The number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "nfix" indicates estimation with no value fixed, "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

fix_num

The fixed value used in fixed-value nonlinear time series regression. The default is 0, which is used for non-fixed estimation. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

n_esti

The number of points for estimation. The default is 2000.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

If "nfix" is selected, the function returns a list in which each element is a matrix representing the estimated function in two dimensions. Otherwise, the function returns a list in which each element is a vector representing the estimated function.


User-specified creation of a Simultaneous Confidence Region (SCR) for the estimated function

Description

This function generates a Simultaneous Confidence Region (SCR) for the estimated function

Usage

fix.SCR(
  ts,
  c,
  d,
  m = "MV",
  b_time,
  b_timese,
  mp_type,
  type,
  fix_num = 0,
  r = 1,
  s = 1,
  n_point = 4000,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "fixt" indicates a fixed time t value, and "fixx" indicates a fixed variate value.

fix_num

The fixed value used in fixed-value nonlinear time series regression. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

n_point

The number of points for the SCR. The default is 4000.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds. The third column is a factor indicating the type of each value in the second column.
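The returned data frames are in long format: one column of inputs, one column of values, and a factor saying whether each value is the estimate or one of its bounds. The base R sketch below builds a mock data frame of that shape and reshapes it to one column per type, which is often easier to inspect. The column names (`input`, `value`, `type`) and the factor levels (`estimate`, `lower`, `upper`) are hypothetical; run `str()` on an actual result to see the names the package uses.

```r
# Mock long-format result with hypothetical column names and factor levels.
grid <- seq(0, 1, length.out = 5)
est  <- sin(2 * pi * grid)
scr_long <- data.frame(
  input = rep(grid, times = 3),
  value = c(est, est - 0.3, est + 0.3),
  type  = factor(rep(c("estimate", "lower", "upper"), each = length(grid)))
)

# One column per type, aligned on the input grid.
scr_wide <- data.frame(
  input = grid,
  unstack(scr_long, value ~ type)
)

# Check that the estimate lies inside the band at every grid point.
all(scr_wide$lower <= scr_wide$estimate & scr_wide$estimate <= scr_wide$upper)
```

The wide layout makes it straightforward to compute the band width per input value or to compare a hypothesized function against the lower and upper columns.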


User-specified time-homogeneity test

Description

This function utilizes Simultaneous Confidence Regions (SCR) for the user-specified execution of time-homogeneity tests.

Usage

homo.test(
  ts,
  c,
  d,
  m = "MV",
  b_time,
  b_timese,
  mp_type,
  fix_num = 0,
  r = 1,
  s = 1,
  n_point = 4000,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The number of basis functions for the time input.

d

The number of basis functions for the variate input.

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

fix_num

The fixed value for time.

r

The number of variates.

s

A positive scaling factor. The default is 1.

n_point

The number of points for the SCR. The default is 4000.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds, which are used for time-homogeneity testing. The third column is a factor indicating the type of each value in the second column.


Visualization of simultaneous confidence region (SCR)

Description

Visualization of simultaneous confidence region (SCR)

Usage

scr.plot(scr_df, ops, title = "", lower = -1.3, upper = 1.3)

Arguments

scr_df

The result of estimation.

ops

The type of estimation: "nfix" indicates estimation with no value fixed, "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

title

The title of the plot.

lower

The lower bound of the scale limits. The default is -1.3.

upper

The upper bound of the scale limits. The default is 1.3.

Value

A plot of the estimated function and its simultaneous confidence region (SCR).

Examples

# Simulate a nonlinear, time-varying AR(1) series of length n
# with innovation standard deviation 1/v.
generate_nAR1 = function(n, v){
 ts = numeric(n)
 w = rnorm(n, 0, 1/v)
 x_ini = runif(1,0,1)  # random initial value
 for(i in 1:n){
   if(i == 1){
     ts[i] = sin(2*pi*(i/n))*exp(-x_ini^2)  + w[i]
   } else{
     ts[i] = sin(2*pi*(i/n))*exp(-ts[i-1]^2) + w[i]
   }
 }
 return(ts)
}
ts  = generate_nAR1(27, 1) # change sample size in real case
# Build the SCR at the fixed time 0.6 with c = 1 and d = 1.
res_esti = fix.SCR(ts, 1, 1, m = "MV", "Legen", "Legen", "algeb", "fixt", 0.6, r = 1)
scr.plot(res_esti[[1]], "fixt")

User-specified separability test

Description

This function utilizes Simultaneous Confidence Regions (SCR) for the user-specified execution of separability tests.

Usage

sep.test(
  ts,
  c,
  d,
  m = "MV",
  b_time,
  b_timese,
  mp_type,
  type,
  fix_num = 0,
  r = 1,
  s = 1,
  n_point = 2000,
  upper = 10
)

Arguments

ts

The data set, typically a time series.

c

The maximum number of basis functions for the time input.

d

The maximum number of basis functions for the variate input.

m

The window size for the simultaneous confidence region procedure. The default is "MV", which stands for the Minimum Volatility method.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

type

The type of estimation: "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

fix_num

The fixed value used in fixed-value nonlinear time series regression. If "fixt" is chosen, it is the fixed time value; otherwise, it is the fixed variate value.

r

The number of variates.

s

A positive scaling factor. The default is 1.

n_point

The number of points for the SCR. The default is 2000.

upper

The upper bound of the variate basis domain. The default is 10. When "algeb" or "logari" is chosen, the domain is automatically set to range from -upper to upper.

Value

A list of data frames, each with three columns. The first column contains the input values. The second column contains the values of the estimated function together with their upper and lower bounds, which are used for separability testing. The third column is a factor indicating the type of each value in the second column.


Predicting time series with 1 step

Description

This function predicts the time series data based on the estimation.

Usage

series.predict(
  ts,
  c,
  d,
  b_time,
  b_timese,
  mp_type,
  r = 1,
  s = 1,
  n_esti = 2000
)

Arguments

ts

The data set, typically a time series.

c

The number of basis functions for the time input.

d

The number of basis functions for the variate input.

b_time

The type of basis for the time input.

b_timese

The type of basis for the variate input.

mp_type

The type of mapping function: "algeb" indicates the algebraic mapping on the real line, and "logari" indicates the logarithmic mapping on the real line.

r

The number of variates.

s

A positive scaling factor. The default is 1.

n_esti

The number of points for estimation. The default is 2000.

Value

The predicted values of the time series.


Visualization of Simultaneous Confidence Region (SCR) for test result

Description

Visualization of Simultaneous Confidence Region (SCR) for test result

Usage

test.plot(df, type, ops = "", title = "", lower = -1.3, upper = 1.3)

Arguments

df

The result of the test (the estimated function under the null and the Simultaneous Confidence Region (SCR)).

type

The type of test: "homot" indicates the time-homogeneity test, and "separa" indicates the separability test.

ops

The type of estimation: "nfix" indicates estimation with no value fixed, "fixt" indicates estimation at a fixed time t, and "fixx" indicates estimation at a fixed variate value.

title

The title of the plot.

lower

The lower bound of the scale limits. The default is -1.3.

upper

The upper bound of the scale limits. The default is 1.3.

Value

A plot of the estimated function under the test and its simultaneous confidence region (SCR).