Vote count:
0
I'm trying to extract one-step forecasts from an ARIMA model with two external regressors as described on Prof Hyndman's blog here. I first generate a model using auto.arima, and then apply this model to the full set.
The code works as it should for the first firm in my sample. The second firm, however, causes an error when extracting the one-step forecasts:
Error in `[<-.default`(`*tmp*`, , "drift", value = c(1.00000000000909, :
subscript out of bounds
The following code contains the time series that causes an error:
df <-structure(list(fYearQtr = c(2004.5, 2004.75, 2005, 2005.25, 2005.5,
2005.75, 2006, 2006.25, 2006.5, 2006.75, 2007, 2007.25, 2007.5,
2007.75, 2008, 2008.25, 2008.5, 2008.75, 2009, 2009.25, 2009.5,
2009.75, 2010, 2010.25, 2010.5, 2010.75, 2011, 2011.25, 2011.5,
2011.75, 2012, 2012.25, 2012.5, 2012.75, 2013, 2013.25, 2013.5,
2013.75, 2014, 2014.25), Sales = c(2014, 2350, 3490, 3243, 3520,
3678, 5749, 4359, 4370, 4837, 7115, 5264, 5410, 6217, 9608, 7512,
7464, 7895, 11880, 9084, 9734, 12207, 15683, 13499, 15700, 20343,
26741, 24667, 28571, 28270, 46333, 39186, 35023, 35966, 54512,
43603, 35323, 37472, 57594, 45646), last_SVI = c(17, 23, 25,
20, 20, 28, 31, 22, 21, 30, 32, 22, 29, 34, 39, 26, 24, 34, 38,
24, 28, 33, 34, 22, 38, 34, 38, 34, 34, 40, 52, 34, 34, 58, 54,
31, 32, 53, 48, 30), SD_SVI = c(0.898717034272917, 1.66410058867569,
2.35883500145783, 2.49615088301353, 1.48064435037847, 2.87562702192596,
3.45854571482559, 2.26738299389972, 1.05003052458683, 3.67772226053586,
3.19855736712181, 5.65685424949238, 2.66024868704471, 5.10153320342434,
3.77236918007361, 2.79880927062444, 2.59437260831385, 3.0697030675746,
4.66162731573098, 2.33973480855395, 3.43063124938119, 3.71069141390533,
3.78255103173669, 9.43873633436932, 4.36918111203273, 3.44368615860597,
4.85032380626706, 3.51188458428425, 2.16617351389673, 3.01066480434182,
13.8264358990424, 5.36966789786234, 3.3166247903554, 14.2644438718921,
7.43260316064229, 2.96777564982468, 4.21383557538856, 12.3594664228036,
6.83880331412088, 2.01913919206257)), .Names = c("fYearQtr",
"Sales", "xReg1", "xReg2"), row.names = c(NA, -40L), class = "data.frame")
Example data:
head(df)
fYearQtr Sales xReg1 xReg2
1 2004.50 2014 17 0.898717
2 2004.75 2350 23 1.664101
3 2005.00 3490 25 2.358835
4 2005.25 3243 20 2.496151
5 2005.50 3520 20 1.480644
6 2005.75 3678 28 2.875627
Building a time series object, train/test set and extracting the one-step forecasts:
require(forecast)
TS <- ts(df[,2:4], start = c(2004,3), end = c(2014,2), frequency=4)
TS.TRAIN <- window(TS, end=2011.4)
TS.TEST <- window(TS, start=2011.5)
# Build an arima model
MODEL <- auto.arima(TS.TRAIN[,'Sales'], xreg=TS.TRAIN[,colnames(TS.TRAIN) %in% c('xReg1', 'xReg2')])
FCAST <- forecast(MODEL, xreg=TS.TEST[,colnames(TS.TEST) %in% c('xReg1', 'xReg2')])
# Resulting model: ARIMA(0,1,0)(1,0,1)[4] with drift
Now extract 1-step forecasts:
refit <- Arima(TS[,'Sales'], model=MODEL, xreg=TS[,colnames(TS) %in% c('xReg1', 'xReg2')])
## Error in `[<-.default`(`*tmp*`, , "drift", value = c(1.00000000000909, :
subscript out of bounds
The confusing part: the exact same code works when using the following data frame (different firm):
#########################################
# Other example: works just fine?
df_noissues <- structure(list(fQtrYear = c(2004.5, 2004.75, 2005, 2005.25, 2005.5,
2005.75, 2006, 2006.25, 2006.5, 2006.75, 2007, 2007.25, 2007.5,
2007.75, 2008, 2008.25, 2008.5, 2008.75, 2009, 2009.25, 2009.5,
2009.75, 2010, 2010.25, 2010.5, 2010.75, 2011, 2011.25, 2011.5,
2011.75, 2012, 2012.25, 2012.5, 2012.75, 2013, 2013.25, 2013.5,
2013.75, 2014, 2014.25), Sales = c(5818, 5979, 6221, 6410, 6401,
6536, 7111, 7797, 7631, 7840, 7908, 8066, 7387, 7387, 6998, 7245,
6970, 5688, 4147, 4244, 4615, 5433, 4887, 5187, 5287, 5652, 5958,
6585, 6419, 5989, 6006, 5963, 5833, 5898, 5833, 5849, 5765, 5585,
5454, 5836), mean_SVI = c(61.1666666666667, 47.9166666666667,
48.5833333333333, 51.4166666666667, 56, 51.8461538461538, 50.1666666666667,
60.75, 53.1538461538462, 48.9230769230769, 53, 53.6923076923077,
55.8461538461538, 46.3333333333333, 51.25, 54.1666666666667,
52.4166666666667, 50.4166666666667, 54.4166666666667, 49.3333333333333,
49.1666666666667, 39.5833333333333, 41.8333333333333, 43.9166666666667,
39.8333333333333, 37.1666666666667, 45.25, 45.9166666666667,
45.8333333333333, 39.7692307692308, 52.8461538461538, 60.6153846153846,
44.0769230769231, 37.75, 47.5, 45.1666666666667, 42.1666666666667,
39.25, 47.25, 47.4166666666667), SD_SVI = c(9.29157324317757,
11.0737883255365, 8.37157890324957, 6.08213977498251, 7.80442764775809,
9.09987320283598, 6.16195561244131, 11.2583302491977, 10.4390784542678,
8.38114489455884, 9.69535971483266, 11.4118696641159, 6.84161474417351,
8.96795642408249, 3.22278817739603, 6.23528570947538, 4.73782330790941,
9.3269729410149, 16.1777531496094, 10.9903538972992, 9.64679252708412,
11.1147595020261, 11.1586357371836, 7.22946412365063, 7.99810583636507,
6.89971453076579, 7.97866473221497, 3.89541299790439, 6.24984848301189,
7.5294294400245, 17.0775005677361, 12.6855459844296, 6.00640683578153,
6.77059148752228, 6.98700091728789, 6.97832140969228, 3.90415474109624,
4.39265916563698, 3.64629326103298, 5.08935311719625)), .Names = c("fQtrYear",
"Sales", "xReg1", "xReg2"), row.names = c(NA, -40L), class = "data.frame")
Example data:
head(df_noissues)
fQtrYear Sales xReg1 xReg2
1 2004.50 5818 61.16667 9.291573
2 2004.75 5979 47.91667 11.073788
3 2005.00 6221 48.58333 8.371579
4 2005.25 6410 51.41667 6.082140
5 2005.50 6401 56.00000 7.804428
6 2005.75 6536 51.84615 9.099873
Running the same code to construct a test/training set & ARIMA model:
TS <- ts(df_noissues[,2:4], start = c(2004,3), end = c(2014,2), frequency=4)
TS.TRAIN <- window(TS, end=2011.4)
TS.TEST <- window(TS, start=2011.5)
# Build an arima model
MODEL <- auto.arima(TS.TRAIN[,'Sales'], xreg=TS.TRAIN[,colnames(TS.TRAIN) %in% c('xReg1', 'xReg2')])
FCAST <- forecast(MODEL, xreg=TS.TEST[,colnames(TS.TEST) %in% c('xReg1', 'xReg2')])
Extracting the 1-step forecasts: no error.
refit <- Arima(TS[,'Sales'], model=MODEL, xreg=TS[,colnames(TS) %in% c('xReg1', 'xReg2')])
## ARIMA(2,0,0)(0,1,0)[4]
Other than a difference in the model generated (with / without drift), I can't really seem to grasp what might be causing this. Running auto.arima with allowdrift=FALSE indeed seems to resolve the issue.
Error in obtaining one-step forecasts from auto.arima generated drift model (forecast package)
Aucun commentaire:
Enregistrer un commentaire