Construction Spending Forecast

The Value of Construction Put in Place Survey (VIP) provides monthly estimates of the total dollar value of construction work done in the U.S. The survey covers construction work done each month on new structures or improvements to existing structures for private and public sectors. Data estimates include the cost of labor and materials, cost of architectural and engineering work, overhead costs, interest and taxes paid during construction, and contractor’s profits. Data collection and estimation activities begin on the first day after the reference month and continue for about three weeks. Reported data and estimates are for activity taking place during the previous calendar month. The survey periods in this analysis covers January 1993 to October 2016. Construction represents ~8% of US GDP and is a very closely watched indicator. After initial exploratory analysis and attempt will be made to forecast the Total number.

Load libraries required for the analysis

options(warn=-1)
library(ggplot2)
library(reshape2)
library(urca)
library(seasonal)
library(lmtest)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(gridBase)
library(forecast)

## Loading required package: timeDate

## This is forecast 7.3

library(zoo)

The initial plot shows construction spend over time with the average spend. The barchart is just another view of the same data. The line plot shows seasonality in the data.

setwd("C:/Users/dhong/Documents/R")
con<-scan("totdata.txt")
con<-ts(con, start=c(1993,1),freq=12)
plot(con,main="Total Construction Spending", ylab="Millions of Dollars")
abline(h=71535.10,col='blue',lwd=2)

summary(con)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30260   57220   72110   71540   84340  109600

sd(con)

## [1] 19140.47

df<-read.table("cons_annual.txt", header = FALSE)
df

##      V1      V2
## 1  1993  485549
## 2  1994  531890
## 3  1995  548667
## 4  1996  599694
## 5  1997  631849
## 6  1998  688520
## 7  1999  744551
## 8  2000  802758
## 9  2001  840247
## 10 2002  847877
## 11 2003  891498
## 12 2004  991357
## 13 2005 1116811
## 14 2006 1161282
## 15 2007 1147953
## 16 2008 1077351
## 17 2009  906544
## 18 2010  809254
## 19 2011  788331
## 20 2012  850456
## 21 2013  906351
## 22 2014 1005629
## 23 2015 1112433
## 24 2016  972188

p <- ggplot(df, aes(V1, V2, group = 1)) + geom_bar(stat = "identity") + labs(x = "Year", y = "Spend $M", title = "Total Construction Spending")
p

Decomposition of the time series shows the observed data broken down by random, seasonal and trend. Seasonal adjustment is made and unit root test is performed. First line chart is seasonally adjusted construction spend and the second plot shows the comparison between seasonally adjusted and non-seasonally adjusted data. First differences look stationary and similar to noise.

dec_con<-decompose(con)
plot(dec_con)

con_sa<-con-dec_con$seasonal
plot(con_sa)

df1<-read.table("sa.txt", header = FALSE)
df1

##     V1    V2     V3
## 1  Jan 84195  71447
## 2  Feb 84641  71159
## 3  Mar 86831  79751
## 4  Apr 90725  88308
## 5  May 92835  94756
## 6  Jun 95831 102716
## 7  Jul 97129 104958
## 8  Aug 97083 106856
## 9  Sep 98663 106868
## 10 Oct 97063 103844
## 11 Nov 93861  94876
## 12 Dec 93575  86894

qplot(V2, V3, data = df1, geom = "line",
    xlab = "Seasonally Adjusted", ylab = "Non-Adjusted Construction Spend",
    main = "Construction Spending Comparison")

plot(diff(con_sa))

The function acf computes estimates of the autocovariance or autocorrelation function. Function pacf is the function used for the partial autocorrelations. A further look at acf and pacf shows the sample autocorrelations are close to 1, the first partial autocorrelation is also close to 1 but the others are not significant.

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(con_sa, main="Total Puplic Construction Spending (SA)")
pacf(con_sa)

acz <- acf(con_sa, plot=F)
acd <- data.frame(lag=acz$lag, acf=acz$acf)
ggplot(acd, aes(lag, acf)) + geom_area(fill="grey") +
  geom_hline(yintercept=c(0.05, -0.05), linetype="dashed") +
  theme_bw()

Significant but small autocorrelation

acf(diff(log(con_sa)))

pacf(diff(log(con_sa)))

Dickey-Fuller test of the null hypothesis to see whether a unit root is present in an autoregressive model.

test <- ur.df(con_sa,type=c("trend"),lags=3,selectlags="AIC")
summary(test)

## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4391.5  -914.7    31.5   944.9  4250.7 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 852.737805 424.605420   2.008   0.0456 *  
## z.lag.1      -0.013249   0.007814  -1.695   0.0911 .  
## tt            1.717111   1.622346   1.058   0.2908    
## z.diff.lag1   0.052140   0.057870   0.901   0.3684    
## z.diff.lag2   0.257464   0.057516   4.476 1.11e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1480 on 277 degrees of freedom
## Multiple R-squared:  0.07744,    Adjusted R-squared:  0.06412 
## F-statistic: 5.813 on 4 and 277 DF,  p-value: 0.0001663
## 
## 
## Value of test-statistic is: -1.6955 1.9643 1.484 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2  6.15  4.71  4.05
## phi3  8.34  6.30  5.36

Compare construction spend from October 1993 to September 2016 and the log

con<-con[10:285]
con<-ts(con, start=c(1993,10),freq=12)
con

##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1993                                                                 45361
## 1994  34917  33222  38215  41754  45711  49032  49545  51516  51718  49135
## 1995  37206  35250  40016  42824  46615  49804  50219  52628  53028  51372
## 1996  39356  37391  42052  46865  51598  54703  55430  57488  58544  57456
## 1997  41882  40788  45704  48998  53617  56977  58826  60682  61323  60053
## 1998  44409  43150  49602  54017  57723  64393  64381  66137  66544  64600
## 1999  48746  48824  55607  58654  62209  67545  68522  69975  70034  68669
## 2000  53782  53993  61295  64524  69253  72715  71774  76132  75153  73550
## 2001  56855  55529  62808  67787  72920  78004  78281  79916  76605  76302
## 2002  59516  58588  63782  69504  73384  77182  78863  79460  76542  75710
## 2003  59877  58526  64506  69638  74473  80377  82971  85191  83841  83133
## 2004  64934  64138  73238  78354  83736  89932  93614  96164  92538  90582
## 2005  72458  73094  81791  88032  93704 100678 103875 107453 104682 104039
## 2006  82400  82381  92354  97056 101862 106777 107150 108598 103102 100721
## 2007  79009  78501  87421  93644  99690 105020 106779 109573 104744 104313
## 2008  78039  77921  83384  89092  93316  97882 100234 100366  97303  96609
## 2009  67301  67030  71985  75043  76661  82089  83885  84426  82037  79952
## 2010  55362  54986  60990  66565  68903  74806  73918  76554  75818  73386
## 2011  50973  51017  57148  61590  65430  72495  72253  76986  75871  73783
## 2012  56608  56994  61803  67002  72228  77580  78305  81152  79404  80287
## 2013  58821  58898  64190  70601  75775  80997  84346  86776  85825  86551
## 2014  67391  66916  73899  80749  85599  90712  92585  93261  93193  94888
## 2015  71447  71159  79751  88308  94756 102716 104958 106856 106868 103844
## 2016  78610  79903  88984  92208  97620 105130 106633 109097 107702       
##         Nov    Dec
## 1993  44786  39535
## 1994  46719  40406
## 1995  48163  41542
## 1996  53498  45313
## 1997  54968  48031
## 1998  60469  53095
## 1999  66684  59082
## 2000  69677  60910
## 2001  71543  63697
## 2002  71362  63984
## 2003  77915  71050
## 2004  86394  77733
## 2005  98348  88657
## 2006  93850  85031
## 2007  94934  84325
## 2008  86093  77112
## 2009  71527  64608
## 2010  67318  60648
## 2011  68203  62582
## 2012  73071  66022
## 2013  79695  73876
## 2014  86262  80174
## 2015  94876  86894
## 2016

plot(con,main= "Total Construction Spending")

lncon<-log(con)
plot(lncon,main= "Total Construction Spending")

Model 1 with trend, show linear trend. Notice the redsidual plot confirming seasonality.

t<-(1:length(lncon))
t2<-t^2
model1<-lm(lncon~t+t2)
summary(model1)

## 
## Call:
## lm(formula = lncon ~ t + t2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.50306 -0.11809  0.02093  0.11543  0.35039 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.061e+01  3.087e-02  343.70   <2e-16 ***
## t            7.215e-03  5.147e-04   14.02   <2e-16 ***
## t2          -1.780e-05  1.799e-06   -9.89   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1697 on 273 degrees of freedom
## Multiple R-squared:  0.6034, Adjusted R-squared:  0.6005 
## F-statistic: 207.7 on 2 and 273 DF,  p-value: < 2.2e-16

dwtest(model1)

## 
##  Durbin-Watson test
## 
## data:  model1
## DW = 0.17321, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0

plot(lncon, ylim=c(7.8,12),main="Total Construction Spending")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.35,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.3,0.3)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)

model1<-lm(lncon~t)

plot(lncon, ylim=c(8.5,12),xlim=c(2010,2016),main="Trend")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.5,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.4,0.4)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)

acz <- acf(residuals, plot=F)
acd <- data.frame(lag=acz$lag, acf=acz$acf)
ggplot(acd, aes(lag, acf)) + geom_area(fill="grey") +
  geom_hline(yintercept=c(0.05, -0.05), linetype="dashed") +
  theme_bw()

Model 2 with trend and seasonality, appears to be significant. The Durbin-Watson Test now gives a more reliable reading on serial autocorrelation. The plot shows that the residuals are predictable.

M1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), length(lncon)/12)
M7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), length(lncon)/12)
M8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), length(lncon)/12)
M9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), length(lncon)/12)
M10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), length(lncon)/12)
M11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), length(lncon)/12)
M12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), length(lncon)/12)

model2<-lm(lncon~0+t+t2+M1+M2+M3+M4+M5+M6+M7+M8+M9+M10+M11+M12)
summary(model2)

## 
## Call:
## lm(formula = lncon ~ 0 + t + t2 + M1 + M2 + M3 + M4 + M5 + M6 + 
##     M7 + M8 + M9 + M10 + M11 + M12)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.308409 -0.048814  0.002709  0.065772  0.240218 
## 
## Coefficients:
##       Estimate Std. Error t value Pr(>|t|)    
## t    7.188e-03  3.757e-04   19.13   <2e-16 ***
## t2  -1.782e-05  1.313e-06  -13.57   <2e-16 ***
## M1   1.071e+01  3.326e-02  322.03   <2e-16 ***
## M2   1.064e+01  3.330e-02  319.40   <2e-16 ***
## M3   1.052e+01  3.334e-02  315.56   <2e-16 ***
## M4   1.042e+01  3.338e-02  312.28   <2e-16 ***
## M5   1.041e+01  3.342e-02  311.50   <2e-16 ***
## M6   1.052e+01  3.345e-02  314.34   <2e-16 ***
## M7   1.059e+01  3.349e-02  316.13   <2e-16 ***
## M8   1.065e+01  3.353e-02  317.66   <2e-16 ***
## M9   1.072e+01  3.356e-02  319.33   <2e-16 ***
## M10  1.073e+01  3.359e-02  319.40   <2e-16 ***
## M11  1.075e+01  3.362e-02  319.86   <2e-16 ***
## M12  1.074e+01  3.366e-02  319.04   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1239 on 262 degrees of freedom
## Multiple R-squared:  0.9999, Adjusted R-squared:  0.9999 
## F-statistic: 1.599e+05 on 14 and 262 DF,  p-value: < 2.2e-16

dwtest(model2)

## 
##  Durbin-Watson test
## 
## data:  model2
## DW = 0.02705, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0

plot(lncon, ylim=c(7.8,12), main="Total Construction Spending")
trend<-fitted(model2)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)

plot(residuals,ylim=c(-0.35,1.5), ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.3,0.3)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)

Model 3 with trend, seasonality and cylcical components. DW ~ 2 which shows it is no longer autocorrelated. Furthermore, the residuals from the acf shows that there are no patterns. The residuals also look to follow a normal distribution.

best.order <- c(0, 0, 0)
best.aic <- Inf
for (q in 0:4) for (p in 0:4) {
  fit.aic <- AIC(arima(residuals,order = c(p,0,q),method="ML",optim.control = list(maxit = 1000)))
  print(c(p,q,fit.aic))
  if (fit.aic < best.aic) {
    best.order <- c(p, 0, q)
    best.arma <- arima(residuals, order = best.order,method="ML",optim.control = list(maxit = 1000))
    best.aic <- fit.aic
  }
}

## [1]    0.0000    0.0000 -110.2378
## [1]    1.000    0.000 -683.585
## [1]    2.0000    0.0000 -855.0029
## [1]    3.0000    0.0000 -856.3412
## [1]    4.0000    0.0000 -876.5083
## [1]    0.0000    1.0000 -413.2412
## [1]    1.0000    1.0000 -796.7033
## [1]    2.0000    1.0000 -808.3001
## [1]    3.0000    1.0000 -861.2236
## [1]    4.0000    1.0000 -915.2475
## [1]    0.0000    2.0000 -652.3781
## [1]    1.0000    2.0000 -852.0378
## [1]    2.0000    2.0000 -870.7039
## [1]    3.0000    2.0000 -865.3275
## [1]    4.000    2.000 -869.131
## [1]    0.0000    3.0000 -778.0164
## [1]    1.000    3.000 -879.418
## [1]    2.0000    3.0000 -880.5576
## [1]    3.0000    3.0000 -918.3735
## [1]    4.000    3.000 -891.537
## [1]    0.0000    4.0000 -847.6986
## [1]    1.0000    4.0000 -882.6405
## [1]    2.0000    4.0000 -893.8468
## [1]    3.0000    4.0000 -885.9664
## [1]    4.0000    4.0000 -889.9617

best.order

## [1] 3 0 3

model1und2<-model.matrix(~0+t+t2+M1+M2+M3+M4+M5+M6+M7+M8+M9+M10+M11+M12)
model3<-arima(lncon,order=c(3,0,2),include.mean = FALSE,xreg=model1und2)
model3

## 
## Call:
## arima(x = lncon, order = c(3, 0, 2), xreg = model1und2, include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2     ar3     ma1     ma2       t  t2       M1
##       0.4285  -0.1354  0.6786  0.4684  0.8790  0.0057   0  10.7337
## s.e.  0.0826   0.0859  0.0824  0.0567  0.0527  0.0013   0   0.1207
##            M2       M3       M4       M5       M6       M7       M8
##       10.6592  10.5436  10.4456  10.4307  10.5370  10.6073  10.6688
## s.e.   0.1207   0.1208   0.1208   0.1208   0.1208   0.1208   0.1208
##            M9      M10      M11      M12
##       10.7350  10.7469  10.7712  10.7534
## s.e.   0.1208   0.1208   0.1208   0.1207
## 
## sigma^2 estimated as 0.0003423:  log likelihood = 707.31,  aic = -1374.62

model3$coef

##           ar1           ar2           ar3           ma1           ma2 
##  4.284716e-01 -1.354236e-01  6.786361e-01  4.684147e-01  8.790080e-01 
##             t            t2            M1            M2            M3 
##  5.660346e-03 -1.012246e-05  1.073366e+01  1.065920e+01  1.054355e+01 
##            M4            M5            M6            M7            M8 
##  1.044560e+01  1.043067e+01  1.053697e+01  1.060726e+01  1.066880e+01 
##            M9           M10           M11           M12 
##  1.073501e+01  1.074693e+01  1.077120e+01  1.075342e+01

sqrt(diag(model3$var.coef))

##          ar1          ar2          ar3          ma1          ma2 
## 0.0826378774 0.0859123280 0.0824192487 0.0567053362 0.0527424397 
##            t           t2           M1           M2           M3 
## 0.0013328127 0.0000057828 0.1206685094 0.1207228706 0.1207669059 
##           M4           M5           M6           M7           M8 
## 0.1207965514 0.1208224324 0.1208412363 0.1208401406 0.1208339297 
##           M9          M10          M11          M12 
## 0.1208287760 0.1207991927 0.1207565590 0.1207270378

dw<-sum((model3$residuals - lag(model3$residuals))^2, na.rm = TRUE)/sum(model3$residuals^2, na.rm = TRUE)
dw

## [1] 2.08078

plot(lncon,yaxt="n",ylim=c(7.5,10.4),ylab="lncon",main="Total Construction Spending")
trend<-lncon-model3$residuals
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
axis(2, pretty(c(8,11)))
par(new=T)
residuals<-model3$residuals
plot(residuals,ylim=c(-0.09,0.2),ylab='',axes=F)
axis(4, pretty(c(-0.05,0.05)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)

hist(residuals, freq=F, breaks = 30, main="Residual Histogram", ylab="frequency density")

The in-sample forecast for two years takes a shorter period in order to compare the forecast vs. actuals.

lncon2014<-lncon[1:(276-24)]
lncon2014<-ts(lncon2014,frequency=12, start=c(1993,10))
z<-(1:length(lncon2014))
z2=z^2
Mo1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), length(lncon2014)/12)
Mo8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), length(lncon2014)/12)
Mo9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), length(lncon2014)/12)
Mo10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), length(lncon2014)/12)
Mo11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), length(lncon2014)/12)
Mo12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), length(lncon2014)/12)
model1and2.2014 <- model.matrix(~ 0 +z+z2+Mo1+Mo2+Mo3+Mo4+Mo5+Mo6+Mo7+Mo8+Mo9+Mo10+Mo11+Mo12)
model3.2014 <- arima(lncon2014,order = c(1, 0, 1),include.mean = FALSE,xreg=model1and2.2014)
model3.2014

## 
## Call:
## arima(x = lncon2014, order = c(1, 0, 1), xreg = model1and2.2014, include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1       z  z2      Mo1      Mo2      Mo3      Mo4
##       0.9861  -0.1263  0.0068   0  10.7266  10.6541  10.5356  10.4396
## s.e.  0.0087   0.0539  0.0013   0   0.1000   0.1001   0.1001   0.1001
##           Mo5      Mo6      Mo7      Mo8      Mo9     Mo10     Mo11
##       10.4231  10.5293  10.6007  10.6627  10.7281  10.7403  10.7659
## s.e.   0.1002   0.1002   0.1002   0.1002   0.1002   0.1002   0.1001
##          Mo12
##       10.7471
## s.e.   0.1001
## 
## sigma^2 estimated as 0.0003718:  log likelihood = 635.8,  aic = -1237.61

model3.2014$coef

##           ar1           ma1             z            z2           Mo1 
##  9.860927e-01 -1.263398e-01  6.781594e-03 -1.642343e-05  1.072665e+01 
##           Mo2           Mo3           Mo4           Mo5           Mo6 
##  1.065410e+01  1.053560e+01  1.043957e+01  1.042306e+01  1.052929e+01 
##           Mo7           Mo8           Mo9          Mo10          Mo11 
##  1.060069e+01  1.066267e+01  1.072806e+01  1.074035e+01  1.076594e+01 
##          Mo12 
##  1.074715e+01

sqrt(diag(model3.2014$var.coef))

##          ar1          ma1            z           z2          Mo1 
## 8.742621e-03 5.386675e-02 1.257567e-03 5.911086e-06 9.999583e-02 
##          Mo2          Mo3          Mo4          Mo5          Mo6 
## 1.000532e-01 1.001006e-01 1.001380e-01 1.001655e-01 1.001828e-01 
##          Mo7          Mo8          Mo9         Mo10         Mo11 
## 1.001902e-01 1.001874e-01 1.001744e-01 1.001513e-01 1.001181e-01 
##         Mo12 
## 1.000746e-01

f<-(253:276)
f2<-f^2
FMo1 = rep(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo2 = rep(c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo3 = rep(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo4 = rep(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo5 = rep(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), 24/12)
FMo6 = rep(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), 24/12)
FMo7 = rep(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), 24/12)
FMo8 = rep(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0), 24/12)
FMo9 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), 24/12)
FMo10 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), 24/12)
FMo11 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), 24/12)
FMo12 = rep(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), 24/12)
FTrSeas2014.2016<-model.matrix(~0+f+f2+FMo1+FMo2+FMo3+FMo4+FMo5+FMo6+FMo7+FMo8+FMo9+FMo10+FMo11+FMo12)

Forecast<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred
Forecast<-exp(Forecast)
Forecast

##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 2014                                                               
## 2015 67836.78 66592.55 73906.14 79213.56 84103.37 89597.35 90511.04
## 2016 66106.29 64871.79 71971.92 77114.15 81846.42 87163.12 88021.80
##           Aug      Sep      Oct      Nov      Dec
## 2014                   90923.05 84398.75 74821.90
## 2015 92656.40 90731.83 88693.45 82301.40 72937.90
## 2016 90077.20 88175.83

UF<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred+1.96*predict(model3.2014,24,newxreg=FTrSeas2014.2016)$se
UF<-exp(UF)
UF

##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2014                                                                      
## 2015  72548.27  71708.99  80073.21  86299.82  92091.74  98565.99 100002.66
## 2016  74578.13  73393.74  81646.93  87705.84  93316.80  99611.72 100818.56
##            Aug       Sep       Oct       Nov       Dec
## 2014                      94425.01  88711.76  79389.08
## 2015 102786.63 101032.42  99113.72  92278.87  82039.19
## 2016 103394.08 101419.51

LF<-predict(model3.2014,24,newxreg=FTrSeas2014.2016)$pred-1.96*predict(model3.2014,24,newxreg=FTrSeas2014.2016)$se
LF<-exp(LF)


plot(con,xlim=c(2012,2016), ylim=c(65000,100000),ylab="Millions of Dollars",main="2 Year In-sample Forecast",lwd=1.5)
trend<-lncon-model3.2014$residuals
trend<-exp(trend)
lines(trend,col="green")
lines(Forecast,col="blue")
lines(UF,col="red")
lines(LF,col="red")

Alternatively, a forecast can be done with Holt-Winters, the graph shows the continuous time series and confidence bands for the prediction.

con1 = read.table("totdata_trans.txt", 
               sep="\t",
               fill=FALSE, 
               strip.white=TRUE)

head(con1)

##           V1    V2
## 1 1993-01-01 31283
## 2 1993-02-01 30264
## 3 1993-03-01 33794
## 4 1993-04-01 37257
## 5 1993-05-01 40124
## 6 1993-06-01 43842

A <- ts(con1$V2, start=c(1993,1), frequency=12)
A

##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1993  31283  30264  33794  37257  40124  43842  44682  46866  47755  45361
## 1994  34917  33222  38215  41754  45711  49032  49545  51516  51718  49135
## 1995  37206  35250  40016  42824  46615  49804  50219  52628  53028  51372
## 1996  39356  37391  42052  46865  51598  54703  55430  57488  58544  57456
## 1997  41882  40788  45704  48998  53617  56977  58826  60682  61323  60053
## 1998  44409  43150  49602  54017  57723  64393  64381  66137  66544  64600
## 1999  48746  48824  55607  58654  62209  67545  68522  69975  70034  68669
## 2000  53782  53993  61295  64524  69253  72715  71774  76132  75153  73550
## 2001  56855  55529  62808  67787  72920  78004  78281  79916  76605  76302
## 2002  59516  58588  63782  69504  73384  77182  78863  79460  76542  75710
## 2003  59877  58526  64506  69638  74473  80377  82971  85191  83841  83133
## 2004  64934  64138  73238  78354  83736  89932  93614  96164  92538  90582
## 2005  72458  73094  81791  88032  93704 100678 103875 107453 104682 104039
## 2006  82400  82381  92354  97056 101862 106777 107150 108598 103102 100721
## 2007  79009  78501  87421  93644  99690 105020 106779 109573 104744 104313
## 2008  78039  77921  83384  89092  93316  97882 100234 100366  97303  96609
## 2009  67301  67030  71985  75043  76661  82089  83885  84426  82037  79952
## 2010  55362  54986  60990  66565  68903  74806  73918  76554  75818  73386
## 2011  50973  51017  57148  61590  65430  72495  72253  76986  75871  73783
## 2012  56608  56994  61803  67002  72228  77580  78305  81152  79404  80287
## 2013  58821  58898  64190  70601  75775  80997  84346  86776  85825  86551
## 2014  67391  66916  73899  80749  85599  90712  92585  93261  93193  94888
## 2015  71447  71159  79751  88308  94756 102716 104958 106856 106868 103844
## 2016  78610  79903  88984  92208  97620 105130 106633 109097 107702 106301
##         Nov    Dec
## 1993  44786  39535
## 1994  46719  40406
## 1995  48163  41542
## 1996  53498  45313
## 1997  54968  48031
## 1998  60469  53095
## 1999  66684  59082
## 2000  69677  60910
## 2001  71543  63697
## 2002  71362  63984
## 2003  77915  71050
## 2004  86394  77733
## 2005  98348  88657
## 2006  93850  85031
## 2007  94934  84325
## 2008  86093  77112
## 2009  71527  64608
## 2010  67318  60648
## 2011  68203  62582
## 2012  73071  66022
## 2013  79695  73876
## 2014  86262  80174
## 2015  94876  86894
## 2016

class(A)

## [1] "ts"

plot.ts(A)

Ahw <- HoltWinters(A)
class(Ahw)

## [1] "HoltWinters"

Ahw

## Holt-Winters exponential smoothing with trend and additive seasonal component.
## 
## Call:
## HoltWinters(x = A)
## 
## Smoothing parameters:
##  alpha: 0.6007149
##  beta : 0.002900526
##  gamma: 1
## 
## Coefficients:
##            [,1]
## a    99038.0956
## b      276.7898
## s1    -598.4259
## s2   -7892.6109
## s3  -16399.0163
## s4  -16678.3738
## s5   -9805.2652
## s6   -5676.1540
## s7    -420.4122
## s8    6050.5999
## s9    7189.1329
## s10   9133.4250
## s11   8295.6227
## s12   7262.9044

Ahw.p <- predict(Ahw, n.ahead=24) 
Ahw.p

##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2016                                                                      
## 2017  83469.45  83466.88  90616.78  95022.68 100555.21 107303.01 108718.34
## 2018  86790.93  86788.36  93938.26  98344.16 103876.69 110624.49 112039.82
##            Aug       Sep       Oct       Nov       Dec
## 2016                                98716.46  91699.06
## 2017 110939.42 110378.41 109622.48 102037.94  95020.54
## 2018 114260.90 113699.88 112943.96

plot(A, xlim=c(1993, 2018))
lines(Ahw.p, col="red")

fhw <- forecast(Ahw, h=24)
fhw

##          Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## Nov 2016       98716.46  96335.83 101097.09  95075.60 102357.32
## Dec 2016       91699.06  88919.78  94478.35  87448.51  95949.61
## Jan 2017       83469.45  80340.02  86598.88  78683.39  88255.50
## Feb 2017       83466.88  80020.98  86912.78  78196.83  88736.93
## Mar 2017       90616.78  86879.52  94354.04  84901.14  96332.42
## Apr 2017       95022.68  91013.69  99031.67  88891.46 101153.90
## May 2017      100555.21  96290.37 104820.06  94032.69 107077.73
## Jun 2017      107303.01 102795.47 111810.56 100409.32 114196.71
## Jul 2017      108718.34 103979.24 113457.44 101470.51 115966.17
## Aug 2017      110939.42 105978.33 115900.51 103352.09 118526.75
## Sep 2017      110378.41 105203.67 115553.14 102464.33 118292.48
## Oct 2017      109622.48 104241.44 115003.52 101392.89 117852.07
## Nov 2017      102037.94  96133.49 107942.38  93007.87 111068.01
## Dec 2017       95020.54  88932.46 101108.63  85709.62 104331.46
## Jan 2018       86790.93  80523.60  93058.25  77205.88  96375.97
## Feb 2018       86788.36  80345.82  93230.90  76935.35  96641.37
## Mar 2018       93938.26  87324.21 100552.31  83822.94 104053.58
## Apr 2018       98344.16  91562.02 105126.30  87971.77 108716.55
## May 2018      103876.69  96929.62 110823.76  93252.07 114501.31
## Jun 2018      110624.49 103515.45 117733.53  99752.15 121496.83
## Jul 2018      112039.82 104771.54 119308.09 100923.95 123155.68
## Aug 2018      114260.90 106835.96 121685.83 102905.44 125616.35
## Sep 2018      113699.88 106120.70 121279.07 102108.52 125291.25
## Oct 2018      112943.96 105212.77 120675.14 101120.13 124767.78

class(fhw)

## [1] "forecast"

autoplot(fhw)

Construction Spending Forecast

Daniel Hong

December 10, 2016