F. Di Iorio, S. Fachin, A. Tarassow, Riccardo "Jack" Lucchetti
1.3
2017-11-14
Stationary bootstrap resampling of time series
C10 C32
This function creates bootstrap pseudo-data resampling a set of zero mean
stationary time series using the Stationary Bootstrap with random block size
by D.N. Politis and J.P. Romano, "The Stationary Bootstrap", Journal of the
American Statistical Association, vol. 89, 1994, pp. 1303-1313.
inputs:
x: matrix, data to be resampled. Rows: time periods, columns: variables
b: int, mean block size; if b=0: use rule-of-thumb b = round(1.75*(T^
(1/3)) as in Palm, F.C., S. Smeekes, , J.P. Urbain (2011) "Cross-sectional
dependence robust block bootstrap panel unit root tests" Journal of Econometrics,
163, 85-104.
output:
matrix of I(0) bootstrap pseudodata of the same dimensions of the input matrix
Contacts:
Francesca Di Iorio, fidiiorio@unina.it
Stefano Fachin, stefano.fachin@uniroma1.it
Note for the sample script:
Example 1 estimates 95% bootstrap confidence intervals for the cointegrating
coefficients of a relationship estimated on Johansen's danish data supplied as
Gretl example dataset. A rather small number of redrawings is used, in actual
empirical work a much larger number, at least 1000 and up to 5000, is recommended.
Real money, real income and the bond rate are found to be cointegrated by using
the Engle-Granger test. Hence, the OLS residuals are stationary. However, they
will generally be weakly dependent, so that simple redrawing as carried out by the
Gretl function "resample" cannot be applied, while block bootstrap as carried out
by "resample" with the optional "b" argument would be legitimate. However, the
block bootstrap is not guaranteed to produce stationary pseudo-residuals. This will
have serious consequences in this context.
Changelog:
1.2 -> 1.3:
- Implementation of a much faster code
- Note: The functions SB_old1 (the initial version) and SB_old2 are deprecated and
for pedagocial purposes only
1.1 -> 1.2:
- Implementation of a faster code
- Old function still available under SBslow()
data to be resampled
block size
```
# Deprecated function -- for pedagocial purpose only
n=rows(x)
# initial check
if n <= 0
funcerr "data input error, check data"
else
if b < 0
funcerr "block input error, check mean block size"
else
# if required set mean block size to rule-of-thumb value
if b==0
b = round(1.75*(n^(1/3)))
endif
scalar prob = 1/(1+b)
# starting points, block sizes, end points
start_v = ceil(muniform(n,1)*n)
sizes=ceil(log(muniform(n,1))/log(1-prob))
end_v=start_v+sizes
x_b = {}
# resampling
loop for i=1..(n-1) --quiet
if end_v[i] <= n
x_b = x_b|x[start_v[i]:end_v[i],]
else
ebb=end_v[i]-n
if ebb <=n
x_b = x_b|x[start_v[i]:n,]|x[1:ebb,]
else
x_b = x_b|x[start_v[i]:n,]|x[1:,]
endif
endif
endloop
# pseudo-data of same length of data
x_b = x_b[1:n,]
endif # close block check
endif # close data check
return x_b
```

Data to be resampled
Block size
```
# Deprecated function -- for pedagocial purpose only
scalar n = rows(x)
if n == 0
funcerr "data input error, check data"
endif
if b==0 # if required set mean block size to rule-of-thumb value
b = round(1.75*(n^(1/3)))
endif
matrix mx = x | x
matrix u = zeros(n,1)
# start resampling
u[1] = ceil(n*randgen1(u,0,1))
loop t=2..n -q
if randgen1(u,0,1)<1/b
u[t] = ceil(n*randgen1(u,0,1))
else
u[t] = u[t-1] + 1
endif
endloop
return mx[u,]
```

Data to be resampled
Block size
```
scalar n = rows(x)
if n == 0
funcerr "data input error, check data"
endif
if b==0 # if required set mean block size to rule-of-thumb value
b = round(1.75*(n^(1/3)))
endif
p = 1/b # probability of a new run
s = seq(1,n)' # sequence
u = 1 | (muniform(n-1,1) .< p) # run starts
r = cum(u) # id of each run
nr = r[n] # how many runs ?
sub = selifr(r ~ s, u) # starting row for each run
sub[,2] -= mrandgen(i, 1, n, nr, 1) # adjust starting points
# --- create mini-trends ----------------
s -= replace(r, sub[,1], sub[,2])
# roll over if necessary
s = ((s-1)%n) + 1
return x[s,]
```

#------------
# Example 1
#------------
set verbose off
include SB.gfn
# load Johansen macroeconomic data
open denmark.gdt
scalar nboot=999 # no. of bootstrap iterations
scalar b = 0 # use rule-of-thumb value
# run OLS with real money as dependent variable, real income and bond rate as explanatory variables.
list lx=LRY IBO
ols LRM 0 lx
bhat=$coeff # store coefficient vector
lrmh=$yhat # store fitted values
uh={$uhat} # store residuals as matrix
# Construct bootstrap pseudodata using rule-of-thumb block size,
# estimate model nboot times and store estimated coefficients in the matrix bb
matrix bb=zeros(nboot,$ncoeff)
loop boot=1..nboot --quiet
matrix ub=SB(uh,b) # resample
series ubs=ub
series lrmb=lrmh+ubs
#Note that since ubs is guaranteed to be stationary lrmb and lx cointegrate by construction. With the block bootstrap this is not true, and some
#of the nboot regressions of lrmb on lx may be spurious
ols lrmb 0 lx --quiet
bb[boot,]=$coeff'
endloop
# take percentiles of distribution of bootstrap estimates
matrix bb05=quantile(bb,0.05)
matrix bb95=quantile(bb,0.95)
print ""
print "******************************************************************************* "
print ""
printf "estimated coefficients %10.2f", bhat'
print ""
print "95% bootstrap confidence intervals"
print ""
printf "Lower limits %15.2f", bb05
printf "Upper limits %15.2f", bb95
print ""
print "******************************************************************************* "
/* Activate iff needed
#--------------------------------------
# Example 2: Comparison of SB() and
# SB_old1() for pedagocial purpose only
#--------------------------------------
set verbose off
include SB.gfn
open denmark.gdt -q
# run OLS with real money as dependent variable, real income and bond rate as explanatory variables.
scalar nboot=1999
scalar b = 0
list lx=LRY IBO
ols LRM 0 lx
bhat=$coeff
lrmh=$yhat
uh={$uhat}
# Run SB_old1()
matrix bb=zeros(nboot,$ncoeff)
set stopwatch
loop boot=1..nboot --quiet
ub=SB_old1(uh,b)
series ubs=ub[,1]
series lrmb=lrmh+ubs
ols lrmb 0 lx --quiet
bb[boot,]=$coeff'
endloop
# take percentiles of distribution of bootstrap estimates
matrix bb05=quantile(bb,0.05)
matrix bb95=quantile(bb,0.95)
print ""
print "******************************************************************************* "
printf "SBslow() took = %.3f sec.\n", $stopwatch
print ""
printf "estimated coefficients %10.2f", bhat'
print ""
print "95% bootstrap confidence intervals"
print ""
printf "Lower limits %15.2f", bb05
printf "Upper limits %15.2f", bb95
print ""
print "******************************************************************************* "
# Run SB()
matrix bb=zeros(nboot,$ncoeff)
set stopwatch
loop boot=1..nboot --quiet
ub=SB(uh,b)
series ubs=ub[,1]
series lrmb=lrmh+ubs
ols lrmb 0 lx --quiet
bb[boot,]=$coeff'
endloop
# take percentiles of distribution of bootstrap estimates
matrix bb05=quantile(bb,0.05)
matrix bb95=quantile(bb,0.95)
print ""
print "******************************************************************************* "
printf "SB() took = %.3f sec.\n", $stopwatch
print ""
printf "estimated coefficients %10.2f", bhat'
print ""
print "95% bootstrap confidence intervals"
print ""
printf "Lower limits %15.2f", bb05
printf "Upper limits %15.2f", bb95
print ""
print "******************************************************************************* "
*/