The Value of Construction Put in Place Survey (VIP) provides monthly estimates of the total dollar value of construction work done in the U.S. The survey covers construction work done each month on new structures or improvements to existing structures for private and public sectors. Data estimates include the cost of labor and materials, cost of architectural and engineering work, overhead costs, interest and taxes paid during construction, and contractor’s profits. Data collection and estimation activities begin on the first day after the reference month and continue for about three weeks. Reported data and estimates are for activity taking place during the previous calendar month. The survey periods in this analysis covers January 1993 to October 2016. Construction represents ~8% of US GDP and is a very closely watched indicator. After initial exploratory analysis and attempt will be made to forecast the Total number.
Load libraries required for the analysis
options(warn=-1)
library(ggplot2)
library(reshape2)
library(urca)
library(seasonal)
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(gridBase)
library(forecast)
## Loading required package: timeDate
## This is forecast 7.3
library(zoo)
The initial plot shows construction spend over time with the average spend. The barchart is just another view of the same data. The line plot shows seasonality in the data.
setwd("C:/Users/dhong/Documents/R")
con<-scan("totdata.txt")
con<-ts(con, start=c(1993,1),freq=12)
plot(con,main="Total Construction Spending", ylab="Millions of Dollars")
abline(h=71535.10,col='blue',lwd=2)
summary(con)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 30260 57220 72110 71540 84340 109600
sd(con)
## [1] 19140.47
df<-read.table("cons_annual.txt", header = FALSE)
df
## V1 V2
## 1 1993 485549
## 2 1994 531890
## 3 1995 548667
## 4 1996 599694
## 5 1997 631849
## 6 1998 688520
## 7 1999 744551
## 8 2000 802758
## 9 2001 840247
## 10 2002 847877
## 11 2003 891498
## 12 2004 991357
## 13 2005 1116811
## 14 2006 1161282
## 15 2007 1147953
## 16 2008 1077351
## 17 2009 906544
## 18 2010 809254
## 19 2011 788331
## 20 2012 850456
## 21 2013 906351
## 22 2014 1005629
## 23 2015 1112433
## 24 2016 972188
p <- ggplot(df, aes(V1, V2, group = 1)) + geom_bar(stat = "identity") + labs(x = "Year", y = "Spend $M", title = "Total Construction Spending")
p
Decomposition of the time series shows the observed data broken down by random, seasonal and trend. Seasonal adjustment is made and unit root test is performed. First line chart is seasonally adjusted construction spend and the second plot shows the comparison between seasonally adjusted and non-seasonally adjusted data. First differences look stationary and similar to noise.
dec_con<-decompose(con)
plot(dec_con)
con_sa<-con-dec_con$seasonal
plot(con_sa)
df1<-read.table("sa.txt", header = FALSE)
df1
## V1 V2 V3
## 1 Jan 84195 71447
## 2 Feb 84641 71159
## 3 Mar 86831 79751
## 4 Apr 90725 88308
## 5 May 92835 94756
## 6 Jun 95831 102716
## 7 Jul 97129 104958
## 8 Aug 97083 106856
## 9 Sep 98663 106868
## 10 Oct 97063 103844
## 11 Nov 93861 94876
## 12 Dec 93575 86894
qplot(V2, V3, data = df1, geom = "line",
xlab = "Seasonally Adjusted", ylab = "Non-Adjusted Construction Spend",
main = "Construction Spending Comparison")
plot(diff(con_sa))
The function acf computes estimates of the autocovariance or autocorrelation function. Function pacf is the function used for the partial autocorrelations. A further look at acf and pacf shows the sample autocorrelations are close to 1, the first partial autocorrelation is also close to 1 but the others are not significant.
par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(con_sa, main="Total Puplic Construction Spending (SA)")
pacf(con_sa)
acz <- acf(con_sa, plot=F)
acd <- data.frame(lag=acz$lag, acf=acz$acf)
ggplot(acd, aes(lag, acf)) + geom_area(fill="grey") +
geom_hline(yintercept=c(0.05, -0.05), linetype="dashed") +
theme_bw()
Significant but small autocorrelation
acf(diff(log(con_sa)))
pacf(diff(log(con_sa)))
Dickey-Fuller test of the null hypothesis to see whether a unit root is present in an autoregressive model.
test <- ur.df(con_sa,type=c("trend"),lags=3,selectlags="AIC")
summary(test)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4391.5 -914.7 31.5 944.9 4250.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 852.737805 424.605420 2.008 0.0456 *
## z.lag.1 -0.013249 0.007814 -1.695 0.0911 .
## tt 1.717111 1.622346 1.058 0.2908
## z.diff.lag1 0.052140 0.057870 0.901 0.3684
## z.diff.lag2 0.257464 0.057516 4.476 1.11e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1480 on 277 degrees of freedom
## Multiple R-squared: 0.07744, Adjusted R-squared: 0.06412
## F-statistic: 5.813 on 4 and 277 DF, p-value: 0.0001663
##
##
## Value of test-statistic is: -1.6955 1.9643 1.484
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2 6.15 4.71 4.05
## phi3 8.34 6.30 5.36
Compare construction spend from October 1993 to September 2016 and the log
con<-con[10:285]
con<-ts(con, start=c(1993,10),freq=12)
con
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1993 45361
## 1994 34917 33222 38215 41754 45711 49032 49545 51516 51718 49135
## 1995 37206 35250 40016 42824 46615 49804 50219 52628 53028 51372
## 1996 39356 37391 42052 46865 51598 54703 55430 57488 58544 57456
## 1997 41882 40788 45704 48998 53617 56977 58826 60682 61323 60053
## 1998 44409 43150 49602 54017 57723 64393 64381 66137 66544 64600
## 1999 48746 48824 55607 58654 62209 67545 68522 69975 70034 68669
## 2000 53782 53993 61295 64524 69253 72715 71774 76132 75153 73550
## 2001 56855 55529 62808 67787 72920 78004 78281 79916 76605 76302
## 2002 59516 58588 63782 69504 73384 77182 78863 79460 76542 75710
## 2003 59877 58526 64506 69638 74473 80377 82971 85191 83841 83133
## 2004 64934 64138 73238 78354 83736 89932 93614 96164 92538 90582
## 2005 72458 73094 81791 88032 93704 100678 103875 107453 104682 104039
## 2006 82400 82381 92354 97056 101862 106777 107150 108598 103102 100721
## 2007 79009 78501 87421 93644 99690 105020 106779 109573 104744 104313
## 2008 78039 77921 83384 89092 93316 97882 100234 100366 97303 96609
## 2009 67301 67030 71985 75043 76661 82089 83885 84426 82037 79952
## 2010 55362 54986 60990 66565 68903 74806 73918 76554 75818 73386
## 2011 50973 51017 57148 61590 65430 72495 72253 76986 75871 73783
## 2012 56608 56994 61803 67002 72228 77580 78305 81152 79404 80287
## 2013 58821 58898 64190 70601 75775 80997 84346 86776 85825 86551
## 2014 67391 66916 73899 80749 85599 90712 92585 93261 93193 94888
## 2015 71447 71159 79751 88308 94756 102716 104958 106856 106868 103844
## 2016 78610 79903 88984 92208 97620 105130 106633 109097 107702
## Nov Dec
## 1993 44786 39535
## 1994 46719 40406
## 1995 48163 41542
## 1996 53498 45313
## 1997 54968 48031
## 1998 60469 53095
## 1999 66684 59082
## 2000 69677 60910
## 2001 71543 63697
## 2002 71362 63984
## 2003 77915 71050
## 2004 86394 77733
## 2005 98348 88657
## 2006 93850 85031
## 2007 94934 84325
## 2008 86093 77112
## 2009 71527 64608
## 2010 67318 60648
## 2011 68203 62582
## 2012 73071 66022
## 2013 79695 73876
## 2014 86262 80174
## 2015 94876 86894
## 2016
plot(con,main= "Total Construction Spending")
lncon<-log(con)
plot(lncon,main= "Total Construction Spending")
Model 1 with trend, show linear trend. Notice the redsidual plot confirming seasonality.
t<-(1:length(lncon))
t2<-t^2
model1<-lm(lncon~t+t2)
summary(model1)
##
## Call:
## lm(formula = lncon ~ t + t2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50306 -0.11809 0.02093 0.11543 0.35039
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.061e+01 3.087e-02 343.70 <2e-16 ***
## t 7.215e-03 5.147e-04 14.02 <2e-16 ***
## t2 -1.780e-05 1.799e-06 -9.89 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1697 on 273 degrees of freedom
## Multiple R-squared: 0.6034, Adjusted R-squared: 0.6005
## F-statistic: 207.7 on 2 and 273 DF, p-value: < 2.2e-16
dwtest(model1)
##
## Durbin-Watson test
##
## data: model1
## DW = 0.17321, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
plot(lncon, ylim=c(7.8,12),main="Total Construction Spending")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.35,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.3,0.3)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)
model1<-lm(lncon~t)
plot(lncon, ylim=c(8.5,12),xlim=c(2010,2016),main="Trend")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.5,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.4,0.4)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)
par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)
acz <- acf(residuals, plot=F)
acd <- data.frame(lag=acz$lag, acf=acz$acf)
ggplot(acd, aes(lag, acf)) + geom_area(fill="grey") +
geom_hline(yintercept=c(0.05, -0.05), linetype="dashed") +
theme_bw()
Model 2 with trend and seasonality, appears to be significant. The Durbin-Watson Test now gives a more reliable reading on serial autocorrelation. The plot shows that the residuals are predictable.
M1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), length(lncon)/12)
M8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), length(lncon)/12)
M9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), length(lncon)/12)
M10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), length(lncon)/12)
M11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), length(lncon)/12)
M12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), length(lncon)/12)
model2<-lm(lncon~0+t+t2+M1+M2+M3+M4+M5+M6+M7+M8+M9+M10+M11+M12)
summary(model2)
##
## Call:
## lm(formula = lncon ~ 0 + t + t2 + M1 + M2 + M3 + M4 + M5 + M6 +
## M7 + M8 + M9 + M10 + M11 + M12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.308409 -0.048814 0.002709 0.065772 0.240218
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## t 7.188e-03 3.757e-04 19.13 <2e-16 ***
## t2 -1.782e-05 1.313e-06 -13.57 <2e-16 ***
## M1 1.071e+01 3.326e-02 322.03 <2e-16 ***
## M2 1.064e+01 3.330e-02 319.40 <2e-16 ***
## M3 1.052e+01 3.334e-02 315.56 <2e-16 ***
## M4 1.042e+01 3.338e-02 312.28 <2e-16 ***
## M5 1.041e+01 3.342e-02 311.50 <2e-16 ***
## M6 1.052e+01 3.345e-02 314.34 <2e-16 ***
## M7 1.059e+01 3.349e-02 316.13 <2e-16 ***
## M8 1.065e+01 3.353e-02 317.66 <2e-16 ***
## M9 1.072e+01 3.356e-02 319.33 <2e-16 ***
## M10 1.073e+01 3.359e-02 319.40 <2e-16 ***
## M11 1.075e+01 3.362e-02 319.86 <2e-16 ***
## M12 1.074e+01 3.366e-02 319.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1239 on 262 degrees of freedom
## Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999
## F-statistic: 1.599e+05 on 14 and 262 DF, p-value: < 2.2e-16
dwtest(model2)
##
## Durbin-Watson test
##
## data: model2
## DW = 0.02705, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
plot(lncon, ylim=c(7.8,12), main="Total Construction Spending")
trend<-fitted(model2)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
plot(residuals,ylim=c(-0.35,1.5), ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.3,0.3)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)
par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)
Model 3 with trend, seasonality and cylcical components. DW ~ 2 which shows it is no longer autocorrelated. Furthermore, the residuals from the acf shows that there are no patterns. The residuals also look to follow a normal distribution.
best.order <- c(0, 0, 0)
best.aic <- Inf
for (q in 0:4) for (p in 0:4) {
fit.aic <- AIC(arima(residuals,order = c(p,0,q),method="ML",optim.control = list(maxit = 1000)))
print(c(p,q,fit.aic))
if (fit.aic < best.aic) {
best.order <- c(p, 0, q)
best.arma <- arima(residuals, order = best.order,method="ML",optim.control = list(maxit = 1000))
best.aic <- fit.aic
}
}
## [1] 0.0000 0.0000 -110.2378
## [1] 1.000 0.000 -683.585
## [1] 2.0000 0.0000 -855.0029
## [1] 3.0000 0.0000 -856.3412
## [1] 4.0000 0.0000 -876.5083
## [1] 0.0000 1.0000 -413.2412
## [1] 1.0000 1.0000 -796.7033
## [1] 2.0000 1.0000 -808.3001
## [1] 3.0000 1.0000 -861.2236
## [1] 4.0000 1.0000 -915.2475
## [1] 0.0000 2.0000 -652.3781
## [1] 1.0000 2.0000 -852.0378
## [1] 2.0000 2.0000 -870.7039
## [1] 3.0000 2.0000 -865.3275
## [1] 4.000 2.000 -869.131
## [1] 0.0000 3.0000 -778.0164
## [1] 1.000 3.000 -879.418
## [1] 2.0000 3.0000 -880.5576
## [1] 3.0000 3.0000 -918.3735
## [1] 4.000 3.000 -891.537
## [1] 0.0000 4.0000 -847.6986
## [1] 1.0000 4.0000 -882.6405
## [1] 2.0000 4.0000 -893.8468
## [1] 3.0000 4.0000 -885.9664
## [1] 4.0000 4.0000 -889.9617
best.order
## [1] 3 0 3
model1und2<-model.matrix(~0+t+t2+M1+M2+M3+M4+M5+M6+M7+M8+M9+M10+M11+M12)
model3<-arima(lncon,order=c(3,0,2),include.mean = FALSE,xreg=model1und2)
model3
##
## Call:
## arima(x = lncon, order = c(3, 0, 2), xreg = model1und2, include.mean = FALSE)
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 t t2 M1
## 0.4285 -0.1354 0.6786 0.4684 0.8790 0.0057 0 10.7337
## s.e. 0.0826 0.0859 0.0824 0.0567 0.0527 0.0013 0 0.1207
## M2 M3 M4 M5 M6 M7 M8
## 10.6592 10.5436 10.4456 10.4307 10.5370 10.6073 10.6688
## s.e. 0.1207 0.1208 0.1208 0.1208 0.1208 0.1208 0.1208
## M9 M10 M11 M12
## 10.7350 10.7469 10.7712 10.7534
## s.e. 0.1208 0.1208 0.1208 0.1207
##
## sigma^2 estimated as 0.0003423: log likelihood = 707.31, aic = -1374.62
model3$coef
## ar1 ar2 ar3 ma1 ma2
## 4.284716e-01 -1.354236e-01 6.786361e-01 4.684147e-01 8.790080e-01
## t t2 M1 M2 M3
## 5.660346e-03 -1.012246e-05 1.073366e+01 1.065920e+01 1.054355e+01
## M4 M5 M6 M7 M8
## 1.044560e+01 1.043067e+01 1.053697e+01 1.060726e+01 1.066880e+01
## M9 M10 M11 M12
## 1.073501e+01 1.074693e+01 1.077120e+01 1.075342e+01
sqrt(diag(model3$var.coef))
## ar1 ar2 ar3 ma1 ma2
## 0.0826378774 0.0859123280 0.0824192487 0.0567053362 0.0527424397
## t t2 M1 M2 M3
## 0.0013328127 0.0000057828 0.1206685094 0.1207228706 0.1207669059
## M4 M5 M6 M7 M8
## 0.1207965514 0.1208224324 0.1208412363 0.1208401406 0.1208339297
## M9 M10 M11 M12
## 0.1208287760 0.1207991927 0.1207565590 0.1207270378
dw<-sum((model3$residuals - lag(model3$residuals))^2, na.rm = TRUE)/sum(model3$residuals^2, na.rm = TRUE)
dw
## [1] 2.08078
plot(lncon,yaxt="n",ylim=c(7.5,10.4),ylab="lncon",main="Total Construction Spending")
trend<-lncon-model3$residuals
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
axis(2, pretty(c(8,11)))
par(new=T)
residuals<-model3$residuals
plot(residuals,ylim=c(-0.09,0.2),ylab='',axes=F)
axis(4, pretty(c(-0.05,0.05)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)
par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)
hist(residuals, freq=F, breaks = 30, main="Residual Histogram", ylab="frequency density")
The in-sample forecast for two years takes a shorter period in order to compare the forecast vs. actuals.
lncon2014<-lncon[1:(276-24)]
lncon2014<-ts(lncon2014,frequency=12, start=c(1993,10))
z<-(1:length(lncon2014))
z2=z^2
Mo1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), length(lncon2014)/12)
Mo9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), length(lncon2014)/12)
Mo10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), length(lncon2014)/12)
Mo11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), length(lncon2014)/12)
Mo12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), length(lncon2014)/12)
model1and2.2014 <- model.matrix(~ 0 +z+z2+Mo1+Mo2+Mo3+Mo4+Mo5+Mo6+Mo7+Mo8+Mo9+Mo10+Mo11+Mo12)
model3.2014 <- arima(lncon2014,order = c(1, 0, 1),include.mean = FALSE,xreg=model1and2.2014)
model3.2014
##
## Call:
## arima(x = lncon2014, order = c(1, 0, 1), xreg = model1and2.2014, include.mean = FALSE)
##
## Coefficients:
## ar1 ma1 z z2 Mo1 Mo2 Mo3 Mo4
## 0.9861 -0.1263 0.0068 0 10.7266 10.6541 10.5356 10.4396
## s.e. 0.0087 0.0539 0.0013 0 0.1000 0.1001 0.1001 0.1001
## Mo5 Mo6 Mo7 Mo8 Mo9 Mo10 Mo11
## 10.4231 10.5293 10.6007 10.6627 10.7281 10.7403 10.7659
## s.e. 0.1002 0.1002 0.1002 0.1002 0.1002 0.1002 0.1001
## Mo12
## 10.7471
## s.e. 0.1001
##
## sigma^2 estimated as 0.0003718: log likelihood = 635.8, aic = -1237.61
model3.2014$coef
## ar1 ma1 z z2 Mo1
## 9.860927e-01 -1.263398e-01 6.781594e-03 -1.642343e-05 1.072665e+01
## Mo2 Mo3 Mo4 Mo5 Mo6
## 1.065410e+01 1.053560e+01 1.043957e+01 1.042306e+01 1.052929e+01
## Mo7 Mo8 Mo9 Mo10 Mo11
## 1.060069e+01 1.066267e+01 1.072806e+01 1.074035e+01 1.076594e+01
## Mo12
## 1.074715e+01
sqrt(diag(model3.2014$var.coef))
## ar1 ma1 z z2 Mo1
## 8.742621e-03 5.386675e-02 1.257567e-03 5.911086e-06 9.999583e-02
## Mo2 Mo3 Mo4 Mo5 Mo6
## 1.000532e-01 1.001006e-01 1.001380e-01 1.001655e-01 1.001828e-01
## Mo7 Mo8 Mo9 Mo10 Mo11
## 1.001902e-01 1.001874e-01 1.001744e-01 1.001513e-01 1.001181e-01
## Mo12
## 1.000746e-01
f<-(253:276)
f2<-f^2
FMo1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), 24/12)
FMo7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), 24/12)
FMo8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), 24/12)
FMo9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), 24/12)
FMo10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), 24/12)
FMo11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), 24/12)
FMo12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), 24/12)
FTrSeas2014.2016<-model.matrix(~0+f+f2+FMo1+FMo2+FMo3+FMo4+FMo5+FMo6+FMo7+FMo8+FMo9+FMo10+FMo11+FMo12)
Forecast<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred
Forecast<-exp(Forecast)
Forecast
## Jan Feb Mar Apr May Jun Jul
## 2014
## 2015 67836.78 66592.55 73906.14 79213.56 84103.37 89597.35 90511.04
## 2016 66106.29 64871.79 71971.92 77114.15 81846.42 87163.12 88021.80
## Aug Sep Oct Nov Dec
## 2014 90923.05 84398.75 74821.90
## 2015 92656.40 90731.83 88693.45 82301.40 72937.90
## 2016 90077.20 88175.83
UF<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred+1.96*predict(model3.2014,24,newxreg=FTrSeas2014.2016)$se
UF<-exp(UF)
UF
## Jan Feb Mar Apr May Jun Jul
## 2014
## 2015 72548.27 71708.99 80073.21 86299.82 92091.74 98565.99 100002.66
## 2016 74578.13 73393.74 81646.93 87705.84 93316.80 99611.72 100818.56
## Aug Sep Oct Nov Dec
## 2014 94425.01 88711.76 79389.08
## 2015 102786.63 101032.42 99113.72 92278.87 82039.19
## 2016 103394.08 101419.51
LF<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred-1.96*predict(model3.2014,24,newxreg=FTrSeas2014.2016)$se
LF<-exp(LF)
plot(con,xlim=c(2012,2016), ylim=c(65000,100000),ylab="Millions of Dollars",main="2 Year In-sample Forecast",lwd=1.5)
trend<-lncon-model3.2014$residuals
trend<-exp(trend)
lines(trend,col="green")
lines(Forecast,col="blue")
lines(UF,col="red")
lines(LF,col="red")
Alternatively, a forecast can be done with Holt-Winters, the graph shows the continuous time series and confidence bands for the prediction.
con1 = read.table("totdata_trans.txt",
sep="\t",
fill=FALSE,
strip.white=TRUE)
head(con1)
## V1 V2
## 1 1993-01-01 31283
## 2 1993-02-01 30264
## 3 1993-03-01 33794
## 4 1993-04-01 37257
## 5 1993-05-01 40124
## 6 1993-06-01 43842
A <- ts(con1$V2, start=c(1993,1), frequency=12)
A
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1993 31283 30264 33794 37257 40124 43842 44682 46866 47755 45361
## 1994 34917 33222 38215 41754 45711 49032 49545 51516 51718 49135
## 1995 37206 35250 40016 42824 46615 49804 50219 52628 53028 51372
## 1996 39356 37391 42052 46865 51598 54703 55430 57488 58544 57456
## 1997 41882 40788 45704 48998 53617 56977 58826 60682 61323 60053
## 1998 44409 43150 49602 54017 57723 64393 64381 66137 66544 64600
## 1999 48746 48824 55607 58654 62209 67545 68522 69975 70034 68669
## 2000 53782 53993 61295 64524 69253 72715 71774 76132 75153 73550
## 2001 56855 55529 62808 67787 72920 78004 78281 79916 76605 76302
## 2002 59516 58588 63782 69504 73384 77182 78863 79460 76542 75710
## 2003 59877 58526 64506 69638 74473 80377 82971 85191 83841 83133
## 2004 64934 64138 73238 78354 83736 89932 93614 96164 92538 90582
## 2005 72458 73094 81791 88032 93704 100678 103875 107453 104682 104039
## 2006 82400 82381 92354 97056 101862 106777 107150 108598 103102 100721
## 2007 79009 78501 87421 93644 99690 105020 106779 109573 104744 104313
## 2008 78039 77921 83384 89092 93316 97882 100234 100366 97303 96609
## 2009 67301 67030 71985 75043 76661 82089 83885 84426 82037 79952
## 2010 55362 54986 60990 66565 68903 74806 73918 76554 75818 73386
## 2011 50973 51017 57148 61590 65430 72495 72253 76986 75871 73783
## 2012 56608 56994 61803 67002 72228 77580 78305 81152 79404 80287
## 2013 58821 58898 64190 70601 75775 80997 84346 86776 85825 86551
## 2014 67391 66916 73899 80749 85599 90712 92585 93261 93193 94888
## 2015 71447 71159 79751 88308 94756 102716 104958 106856 106868 103844
## 2016 78610 79903 88984 92208 97620 105130 106633 109097 107702 106301
## Nov Dec
## 1993 44786 39535
## 1994 46719 40406
## 1995 48163 41542
## 1996 53498 45313
## 1997 54968 48031
## 1998 60469 53095
## 1999 66684 59082
## 2000 69677 60910
## 2001 71543 63697
## 2002 71362 63984
## 2003 77915 71050
## 2004 86394 77733
## 2005 98348 88657
## 2006 93850 85031
## 2007 94934 84325
## 2008 86093 77112
## 2009 71527 64608
## 2010 67318 60648
## 2011 68203 62582
## 2012 73071 66022
## 2013 79695 73876
## 2014 86262 80174
## 2015 94876 86894
## 2016
class(A)
## [1] "ts"
plot.ts(A)
Ahw <- HoltWinters(A)
class(Ahw)
## [1] "HoltWinters"
Ahw
## Holt-Winters exponential smoothing with trend and additive seasonal component.
##
## Call:
## HoltWinters(x = A)
##
## Smoothing parameters:
## alpha: 0.6007149
## beta : 0.002900526
## gamma: 1
##
## Coefficients:
## [,1]
## a 99038.0956
## b 276.7898
## s1 -598.4259
## s2 -7892.6109
## s3 -16399.0163
## s4 -16678.3738
## s5 -9805.2652
## s6 -5676.1540
## s7 -420.4122
## s8 6050.5999
## s9 7189.1329
## s10 9133.4250
## s11 8295.6227
## s12 7262.9044
Ahw.p <- predict(Ahw, n.ahead=24)
Ahw.p
## Jan Feb Mar Apr May Jun Jul
## 2016
## 2017 83469.45 83466.88 90616.78 95022.68 100555.21 107303.01 108718.34
## 2018 86790.93 86788.36 93938.26 98344.16 103876.69 110624.49 112039.82
## Aug Sep Oct Nov Dec
## 2016 98716.46 91699.06
## 2017 110939.42 110378.41 109622.48 102037.94 95020.54
## 2018 114260.90 113699.88 112943.96
plot(A, xlim=c(1993, 2018))
lines(Ahw.p, col="red")
fhw <- forecast(Ahw, h=24)
fhw
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Nov 2016 98716.46 96335.83 101097.09 95075.60 102357.32
## Dec 2016 91699.06 88919.78 94478.35 87448.51 95949.61
## Jan 2017 83469.45 80340.02 86598.88 78683.39 88255.50
## Feb 2017 83466.88 80020.98 86912.78 78196.83 88736.93
## Mar 2017 90616.78 86879.52 94354.04 84901.14 96332.42
## Apr 2017 95022.68 91013.69 99031.67 88891.46 101153.90
## May 2017 100555.21 96290.37 104820.06 94032.69 107077.73
## Jun 2017 107303.01 102795.47 111810.56 100409.32 114196.71
## Jul 2017 108718.34 103979.24 113457.44 101470.51 115966.17
## Aug 2017 110939.42 105978.33 115900.51 103352.09 118526.75
## Sep 2017 110378.41 105203.67 115553.14 102464.33 118292.48
## Oct 2017 109622.48 104241.44 115003.52 101392.89 117852.07
## Nov 2017 102037.94 96133.49 107942.38 93007.87 111068.01
## Dec 2017 95020.54 88932.46 101108.63 85709.62 104331.46
## Jan 2018 86790.93 80523.60 93058.25 77205.88 96375.97
## Feb 2018 86788.36 80345.82 93230.90 76935.35 96641.37
## Mar 2018 93938.26 87324.21 100552.31 83822.94 104053.58
## Apr 2018 98344.16 91562.02 105126.30 87971.77 108716.55
## May 2018 103876.69 96929.62 110823.76 93252.07 114501.31
## Jun 2018 110624.49 103515.45 117733.53 99752.15 121496.83
## Jul 2018 112039.82 104771.54 119308.09 100923.95 123155.68
## Aug 2018 114260.90 106835.96 121685.83 102905.44 125616.35
## Sep 2018 113699.88 106120.70 121279.07 102108.52 125291.25
## Oct 2018 112943.96 105212.77 120675.14 101120.13 124767.78
class(fhw)
## [1] "forecast"
autoplot(fhw)