`library(OBL)`

The ARIMA method has limitations, including with small sample sizes. Although analyses of small-sample series are available in a few cases, there is currently no widely applicable and easily accessible method for small-sample inference. Methods such as Edgeworth expansions involve a lot of algebra (which may discourage users) and apply only in very special cases. The regular bootstrap method, a potential alternative, fails on the grounds of conflicting assumptions: it assumes that observations are independent and identically distributed (i.i.d.), whereas time series data are typically dependent.

To avoid imposing the i.i.d. assumption of the regular bootstrap method
of Efron (1979) on individual observations, and still maintain the
dependence structure of time series data, one can slice the series into
a number of chunks (blocks), each of length *l*. This way the
dependence structure within each block is kept. Instead of sampling
individual observations randomly with replacement (as the traditional
bootstrap method does), the blocks are sampled. Only the dependence
between blocks is distorted (serial correlation is broken at the block
boundaries), while the blocks themselves can be treated as i.i.d. This
way, the i.i.d. assumption of the regular bootstrap method and the
serial correlation of typical time series data are reconciled in one
method. Methods that achieve these two opposing objectives are broadly
called Block Bootstrap Methods.
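
The idea of resampling whole blocks rather than single observations can be illustrated in a few lines of base R. This is only a minimal sketch, not the package's implementation: the toy series, the block length `l = 3` and the variable names are all assumptions chosen for illustration.

```r
# Illustrative sketch only: resample whole blocks instead of single observations.
set.seed(1)
x <- as.numeric(arima.sim(n = 12, model = list(ar = 0.8)))  # toy AR(1) series
l <- 3                     # block length (hypothetical choice)
m <- length(x) %/% l       # number of complete blocks

# Split the series into m consecutive blocks of length l.
blocks <- split(x[seq_len(m * l)], rep(seq_len(m), each = l))

# Draw m blocks with replacement and glue them back together.
picked <- sample(seq_len(m), size = m, replace = TRUE)
x_star <- unlist(blocks[picked], use.names = FALSE)
```

Within each drawn block the original ordering of the observations is intact; only the joins between blocks are new.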

The main challenge with block bootstrap procedures is the sensitivity
of the Root Mean Squared Error (RMSE) to the choice of block length
(*l*) or of the number of blocks (*m*). This is one problem defined in
two ways; the `OBL: Optimum Block Length` package approaches it through
**the choice of block length**. Several methods can be used (they are
explained briefly below), and each has numerous candidate block lengths
that must be considered. This is the problem the
`OBL: Optimum Block Length` package is here to solve.

The OBL package provides the optimum block length for five (5) different block bootstrap methods, namely:

1. The Non-overlapping Block Bootstrap (NBB) uses the method described in Carlstein (1986), which splits the original series into non-overlapping blocks and then resamples the blocks a number of times (denoted *R*) to form a new series.
2. The Moving Block Bootstrap (MBB), otherwise called the Overlapping Block Bootstrap, uses the method described in Kunsch (1989), which splits the original series into overlapping blocks and then resamples the blocks a number of times (denoted *R*) to form a new series.
3. The Circular Block Bootstrap (CBB) uses the method described in Politis and Romano (1992). It is an improvement on the MBB of Kunsch (1989) that makes provision for observations at the tail end of the original series that would otherwise be cut off from resampling because the left-over elements are fewer than the predetermined block length. This happens when \(n\) is not a multiple of \(l\), where \(n\) is the length of the original series and \(l\) is the predetermined block length, \(1 < l < n\). The left-over is completed by appending the first element(s) of the original series, so that the series forms a circle. The blocks are then resampled a number of times (denoted *R*) to form a new series.
4. The Tapered Moving Block Bootstrap (TMBB, unpublished) is designed to reduce the number of under-represented extreme members of the series from \(2l\) to just 2. Reducing the number of under-represented elements helps to improve the model evaluation metrics (RMSE and MAE). The blocks are then resampled a number of times (denoted *R*) to form a new series.
5. The Tapered Circular Block Bootstrap (TCBB, unpublished) is an extension of TMBB in which the last block contains the first element of the parent series as its last element. It is designed to reduce the number of under-represented extreme members of the series from \(2l\) (in the case of MBB) and from 2 (in the case of TMBB) to just 1. Reducing the number of under-represented elements helps to improve the model evaluation metrics (that is, it leads to a reduced RMSE). The blocks are then resampled a number of times (denoted *R*) to form a new series.
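
The difference between the three published schemes lies in where blocks are allowed to start. The sketch below builds the candidate blocks for NBB, MBB and CBB in base R. The helper names `make_blocks()` and `resample_blocks()` are hypothetical and are not part of the OBL package; the unpublished TMBB and TCBB schemes are not sketched because their exact construction is not spelled out here.

```r
# Hypothetical helper (not OBL's API): list the candidate blocks of length l.
make_blocks <- function(x, l, type = c("nbb", "mbb", "cbb")) {
  type <- match.arg(type)
  n <- length(x)
  starts <- switch(type,
    nbb = seq(1, n - l + 1, by = l),  # disjoint blocks; a short tail is dropped
    mbb = seq_len(n - l + 1),         # every start position that fits
    cbb = seq_len(n)                  # every start position, wrapping around
  )
  # Modular indexing wraps past the end of the series (only needed for CBB).
  lapply(starts, function(s) x[(s - 1 + 0:(l - 1)) %% n + 1])
}

# Hypothetical helper: draw blocks with replacement and trim to length n.
resample_blocks <- function(blocks, n) {
  k <- ceiling(n / length(blocks[[1]]))
  picked <- sample(length(blocks), k, replace = TRUE)
  unlist(blocks[picked])[seq_len(n)]
}

x <- 1:10
nbb <- make_blocks(x, l = 3, type = "nbb")
mbb <- make_blocks(x, l = 3, type = "mbb")
cbb <- make_blocks(x, l = 3, type = "cbb")
x_star <- resample_blocks(cbb, n = length(x))
```

With `x = 1:10` and `l = 3` this yields 3, 8 and 10 candidate blocks respectively. The NBB blocks never reach observation 10, while the last CBB block wraps around to the start of the series.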

The package also checks every possible block length *l* (where
\(1 < l < n\) and \(n\) is the length of the original time series) for
each method: it calculates the RMSE for every candidate block length
and picks the one with the minimum value. The minimum RMSE of each
method is collected in a data frame with three (3) columns (Methods, lb
and RMSE), so that users of the `OBL: Optimum Block Length` package can
choose the method and the block length with the smallest RMSE from the
output data frame.
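
The search itself is a plain grid search: score every block length, keep the smallest score. The sketch below does this for the moving block scheme only. It rests on a loud assumption: each block length is scored by the RMSE between the lag-1 autocorrelation of the resampled series and that of the original series. The criterion OBL actually uses may differ, and the helper names (`lag1`, `mbb_resample`, `rmse_for_length`) are hypothetical.

```r
# Assumed scoring statistic: lag-1 autocorrelation (OBL's criterion may differ).
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

# One moving-block resample of x with block length l, trimmed to length n.
mbb_resample <- function(x, l) {
  n <- length(x)
  starts <- sample(n - l + 1, ceiling(n / l), replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + l - 1)))
  x[idx[seq_len(n)]]
}

# RMSE of the statistic over R resamples, relative to the original series.
rmse_for_length <- function(x, l, R) {
  target <- lag1(x)
  est <- replicate(R, lag1(mbb_resample(x, l)))
  sqrt(mean((est - target)^2))
}

set.seed(6)
x <- as.numeric(arima.sim(n = 30, model = list(ar = 0.8)))
lengths_tried <- 2:(length(x) - 1)   # every l with 1 < l < n
rmse <- vapply(lengths_tried, function(l) rmse_for_length(x, l, R = 50),
               numeric(1))

# One row per method, mirroring the three-column layout described above.
best <- data.frame(Methods = "mbb",
                   lb = lengths_tried[which.min(rmse)],
                   RMSE = min(rmse))
```

Repeating the same loop for each resampling scheme and stacking the resulting rows would give a table of the kind described above.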

You can install the development version from GitHub with:

```
install.packages("devtools")
devtools::install_github("sta189332/OBL")
```

The optimum block length depends on the particular time series at
hand. Block bootstrap users therefore need a flexible way to choose the
optimum block length, one that is concise, clear and easy to use. The
`OBL: Optimum Block Length` package was created to meet this need: it
helps users search for the block length and the method with the minimum
RMSE value.

The `blockboot()` function produces a data frame with three (3) columns
(Methods, lb and RMSE).

The `lolliblock()` function plots a lollipop chart of the data frame
returned by `blockboot()`. It shows the optimum block length for each
method, with colours ranging from red to green: red marks the method
with the worst performance (the highest RMSE) and green marks the
method with the smallest RMSE. The block length of each method is shown
in a **legend** with matching colours.

The minimum arguments to `blockboot()` are `ts`, which should be a
univariate time series, and `R`, which is the number of resampling
replicates.

```
blockboot(ts,
          R,
          seed,
          n_cores,
          methods = c("optnbb", "optmbb", "optcbb", "opttmbb", "opttcbb"))
```

Likewise, the minimum arguments to `lolliblock()` are `ts`, which
should be a univariate time series, and `R`, which is the number of
resampling replicates.

```
lolliblock(ts,
           R,
           seed,
           n_cores,
           methods = c("optnbb", "optmbb", "optcbb", "opttmbb", "opttcbb"))
```

| Argument | Description |
|---|---|
| `ts` | Univariate time series data |
| `R` | Number of replications for resampling |
| `seed` | RNG seed |
| `n_cores` | Number of core(s) to be used on your operating system |
| `methods` | Optional; if specified, it must be any combination of the following: "optnbb", "optmbb", "optcbb", "opttmbb", "opttcbb" |

The function outputs a data frame with 5 rows and 3 columns: “Methods”, “lb” and “RMSE”. The method with the minimum RMSE value is the preferred one.

```
# simulate univariate time series data
set.seed(289805)
ts <- arima.sim(n = 10, model = list(ar = 0.8, order = c(1, 0, 0)), sd = 1)
# get the optimal block length table
OBL::blockboot(ts = ts, R = 100, seed = 6, n_cores = 2)
#   Methods lb      RMSE
#1      nbb  9 0.2402482
#2      mbb  9 0.1023012
#3      cbb  8 0.2031448
#4     tmbb  4 0.2654746
#5     tcbb  9 0.4048711
```
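
Choosing the preferred method from such a table is a one-line lookup. The sketch below rebuilds the example output by hand, so that it runs without the package installed, and then picks the row with the smallest RMSE. The object names `res` and `best` are illustrative.

```r
# Values copied from the blockboot() example output in this document.
res <- data.frame(
  Methods = c("nbb", "mbb", "cbb", "tmbb", "tcbb"),
  lb      = c(9, 9, 8, 4, 9),
  RMSE    = c(0.2402482, 0.1023012, 0.2031448, 0.2654746, 0.4048711)
)

# Row with the smallest RMSE: the preferred method and its block length.
best <- res[which.min(res$RMSE), ]
best$Methods  # "mbb"
best$lb       # 9

# Full ranking, from best to worst.
ranked <- res[order(res$RMSE), ]
```

For this example the moving block bootstrap with block length 9 has the lowest RMSE.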

The function outputs a lollipop chart with 5 lollipops, one for each of
the 5 methods, in 5 distinct colours. The red lollipop marks the least
desirable method (the highest RMSE) and the green lollipop marks the
preferred method (the lowest RMSE). The **legend** beside the chart
indicates the optimum block length for each method.

```
# simulate univariate time series data
set.seed(289805)
ts <- arima.sim(n = 10, model = list(ar = 0.8, order = c(1, 0, 0)), sd = 1)
# plot the optimal block lengths as a lollipop chart
OBL::lolliblock(ts = ts, R = 100, seed = 6, n_cores = 2)
```
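
For readers who want to see how such a chart is put together, here is a rough base-graphics sketch of a lollipop chart with a green-to-red colour scale. It is an assumption-laden illustration, not the code behind `lolliblock()`: the data frame is typed in by hand from the `blockboot()` example output in this document, and the layout choices are arbitrary.

```r
# Values copied from the blockboot() example output in this document.
res <- data.frame(
  Methods = c("nbb", "mbb", "cbb", "tmbb", "tcbb"),
  lb      = c(9, 9, 8, 4, 9),
  RMSE    = c(0.2402482, 0.1023012, 0.2031448, 0.2654746, 0.4048711)
)
res <- res[order(res$RMSE), ]   # best (lowest RMSE) first

# One colour per method: green for the lowest RMSE, red for the highest.
cols <- colorRampPalette(c("green", "red"))(nrow(res))

pos <- seq_len(nrow(res))
plot(pos, res$RMSE, type = "n", xaxt = "n",
     xlab = "Method", ylab = "RMSE", ylim = c(0, max(res$RMSE)))
axis(1, at = pos, labels = res$Methods)
segments(x0 = pos, y0 = 0, x1 = pos, y1 = res$RMSE, col = cols, lwd = 2)  # sticks
points(pos, res$RMSE, pch = 19, cex = 2, col = cols)                      # pops
legend("topleft", legend = paste0(res$Methods, ": lb = ", res$lb),
       col = cols, pch = 19, title = "Optimum block length")
```

Sorting by RMSE before assigning colours is what ties green to the best method and red to the worst.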


Carlstein, Edward. 1986. “The Use of Subseries Values for
Estimating the Variance of a General Statistic from a Stationary
Sequence.” *The Annals of Statistics*, 1171–79.

Efron, B. 1979. “Bootstrap Methods: Another Look at the
Jackknife.” *The Annals of Statistics* 7 (1): 1–26.

Kunsch, Hans R. 1989. “The Jackknife and the Bootstrap for General
Stationary Observations.” *The Annals of Statistics*,
1217–41.

Politis, Dimitris N, and Joseph P Romano. 1992. “A Circular
Block-Resampling Procedure for Stationary Data.” *Exploring
the Limits of Bootstrap*, 263–70.