The Value of Construction Put in Place Survey (VIP) provides monthly estimates of the total dollar value of construction work done in the U.S. The survey covers construction work done each month on new structures or improvements to existing structures for the private and public sectors. The estimates include the cost of labor and materials, the cost of architectural and engineering work, overhead costs, interest and taxes paid during construction, and contractor’s profits. Data collection and estimation activities begin on the first day after the reference month and continue for about three weeks. Reported data and estimates are for activity taking place during the previous calendar month. The survey period in this analysis covers January 1993 to October 2016. Construction represents about 8% of U.S. GDP and is a closely watched indicator. After an initial exploratory analysis, an attempt will be made to forecast total construction spending.

Load the libraries required for the analysis.

options(warn=-1)
library(ggplot2)
library(reshape2)
library(urca)
library(seasonal)
library(lmtest)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(gridBase)
library(forecast)
## Loading required package: timeDate
## This is forecast 7.3
library(zoo)

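The first plot shows monthly construction spending over time, with the sample mean (about $71,535 million) drawn as a horizontal blue line. The bar chart shows the same data aggregated by year; note that the 2016 bar covers only January through October. The monthly line plot shows clear seasonality in the data.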

setwd("C:/Users/dhong/Documents/R")
con<-scan("totdata.txt")
con<-ts(con, start=c(1993,1),freq=12)
plot(con,main="Total Construction Spending", ylab="Millions of Dollars")
abline(h=mean(con),col='blue',lwd=2)  # sample mean, about 71535

summary(con)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30260   57220   72110   71540   84340  109600
sd(con)
## [1] 19140.47
df<-read.table("cons_annual.txt", header = FALSE)
df
##      V1      V2
## 1  1993  485549
## 2  1994  531890
## 3  1995  548667
## 4  1996  599694
## 5  1997  631849
## 6  1998  688520
## 7  1999  744551
## 8  2000  802758
## 9  2001  840247
## 10 2002  847877
## 11 2003  891498
## 12 2004  991357
## 13 2005 1116811
## 14 2006 1161282
## 15 2007 1147953
## 16 2008 1077351
## 17 2009  906544
## 18 2010  809254
## 19 2011  788331
## 20 2012  850456
## 21 2013  906351
## 22 2014 1005629
## 23 2015 1112433
## 24 2016  972188
p <- ggplot(df, aes(V1, V2, group = 1)) + geom_bar(stat = "identity") + labs(x = "Year", y = "Spend $M", title = "Total Construction Spending")
p
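
The annual figures in the bar chart are the sums of the monthly values. As a minimal sketch of that aggregation, the block below takes the twelve monthly values for 2015 (the same figures that appear in the monthly table printed later in this analysis) and sums them with aggregate(). The object names monthly_2015 and annual_2015 are illustrative and are not part of the original analysis.

```r
# Monthly construction spending for 2015, in millions of dollars
monthly_2015 <- ts(c(71447, 71159, 79751, 88308, 94756, 102716,
                     104958, 106856, 106868, 103844, 94876, 86894),
                   start = c(2015, 1), frequency = 12)

# Collapse the monthly series to one value per year by summing
annual_2015 <- aggregate(monthly_2015, nfrequency = 1, FUN = sum)
annual_2015
```

The result agrees with the 2015 row of cons_annual.txt (1112433).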

Decomposing the time series splits the observed data into trend, seasonal and random components. The series is then seasonally adjusted by subtracting the seasonal component, and a unit root test is performed further below. The first line chart shows seasonally adjusted construction spending, and the second plot compares the seasonally adjusted data with the non-adjusted data. The first differences of the adjusted series look stationary and similar to noise.

dec_con<-decompose(con)
plot(dec_con)

con_sa<-con-dec_con$seasonal
plot(con_sa)
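
By default decompose() fits an additive model, so the observed series equals trend + seasonal + random, and subtracting the seasonal component leaves trend + random. The sketch below checks this on a synthetic monthly series; the series x and all of its parameters are made up for illustration and are not the construction data.

```r
set.seed(1)
n <- 120
# Synthetic monthly series: linear trend + 12-month cycle + noise (illustrative only)
x <- ts(50 + 0.5 * (1:n) + 10 * sin(2 * pi * (1:n) / 12) + rnorm(n),
        start = c(2000, 1), frequency = 12)

dec  <- decompose(x)        # additive decomposition (the default)
x_sa <- x - dec$seasonal    # seasonally adjusted series, as done for con_sa

# The trend is a centred moving average, so it is NA at both ends
ok <- !is.na(dec$trend)

# observed = trend + seasonal + random wherever the trend is defined
all.equal(as.numeric(x[ok]),
          as.numeric((dec$trend + dec$seasonal + dec$random)[ok]))

# The twelve seasonal figures are centred, so they sum to (about) zero
sum(dec$figure)
```

Because the moving-average trend is undefined for the first and last six months, the random component is NA there as well, while the seasonally adjusted series is available for every month.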

df1<-read.table("sa.txt", header = FALSE)
df1
##     V1    V2     V3
## 1  Jan 84195  71447
## 2  Feb 84641  71159
## 3  Mar 86831  79751
## 4  Apr 90725  88308
## 5  May 92835  94756
## 6  Jun 95831 102716
## 7  Jul 97129 104958
## 8  Aug 97083 106856
## 9  Sep 98663 106868
## 10 Oct 97063 103844
## 11 Nov 93861  94876
## 12 Dec 93575  86894
qplot(V2, V3, data = df1, geom = "line",
    xlab = "Seasonally Adjusted", ylab = "Non-Adjusted Construction Spend",
    main = "Construction Spending Comparison")

plot(diff(con_sa))

The function acf estimates the autocovariance or autocorrelation function, and pacf estimates the partial autocorrelations. The sample autocorrelations of the seasonally adjusted series are close to 1. The first partial autocorrelation is also close to 1, while the higher-order partial autocorrelations are not significant. This pattern is typical of a series with a unit root.

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(con_sa, main="Total Construction Spending (SA)")
pacf(con_sa)

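acz <- acf(con_sa, plot=FALSE)
acd <- data.frame(lag=as.numeric(acz$lag), acf=as.numeric(acz$acf))
# approximate 95% bounds under white noise: +/- 1.96/sqrt(n)
ci <- qnorm(0.975)/sqrt(acz$n.used)
ggplot(acd, aes(lag, acf)) + geom_area(fill="grey") +
  geom_hline(yintercept=c(ci, -ci), linetype="dashed") +
  theme_bw()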
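
To illustrate why autocorrelations close to 1 point towards differencing, the sketch below compares the lag-1 autocorrelation of a simulated random walk with that of its first differences, together with the approximate 95% bounds of +/- 1.96/sqrt(n) that acf draws by default. The series rw is simulated and purely illustrative.

```r
set.seed(42)
rw <- cumsum(rnorm(300))   # simulated random walk (has a unit root)

acf_level <- acf(rw, plot = FALSE)
acf_diff  <- acf(diff(rw), plot = FALSE)

# Element 1 of $acf is lag 0 (always 1), so element 2 is lag 1
lag1_level <- acf_level$acf[2]
lag1_diff  <- acf_diff$acf[2]

# Approximate 95% bounds under the white-noise null
bound <- qnorm(0.975) / sqrt(length(rw))

c(level = lag1_level, diff = lag1_diff, bound = bound)
```

The level series has a lag-1 autocorrelation near 1, while the differenced series behaves much more like noise.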

The first differences of the log of the seasonally adjusted series show significant but small autocorrelation.

acf(diff(log(con_sa)))

pacf(diff(log(con_sa)))
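
The difference of logs used above is approximately the month-over-month growth rate. A small worked example, using the first four non-adjusted values printed from sa.txt above (January to April); the variable names are illustrative.

```r
# Non-adjusted spending, January to April, from the sa.txt table (millions of dollars)
x <- c(71447, 71159, 79751, 88308)

log_diff <- diff(log(x))              # log growth rate
pct      <- x[-1] / x[-length(x)] - 1 # simple growth rate

# Exact relationship: diff(log(x)) = log(1 + pct)
all.equal(log_diff, log1p(pct))

round(cbind(log_diff, pct), 4)
```

The two measures agree closely for small changes and drift apart as the monthly change grows, which is worth keeping in mind for a strongly seasonal series.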

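An augmented Dickey-Fuller test is used to test the null hypothesis that a unit root is present in the autoregressive model. In the output below, the tau3 test statistic (-1.6955) is well above the 10% critical value (-3.13), so the null hypothesis of a unit root cannot be rejected.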

test <- ur.df(con_sa,type=c("trend"),lags=3,selectlags="AIC")
summary(test)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4391.5  -914.7    31.5   944.9  4250.7 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 852.737805 424.605420   2.008   0.0456 *  
## z.lag.1      -0.013249   0.007814  -1.695   0.0911 .  
## tt            1.717111   1.622346   1.058   0.2908    
## z.diff.lag1   0.052140   0.057870   0.901   0.3684    
## z.diff.lag2   0.257464   0.057516   4.476 1.11e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1480 on 277 degrees of freedom
## Multiple R-squared:  0.07744,    Adjusted R-squared:  0.06412 
## F-statistic: 5.813 on 4 and 277 DF,  p-value: 0.0001663
## 
## 
## Value of test-statistic is: -1.6955 1.9643 1.484 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2  6.15  4.71  4.05
## phi3  8.34  6.30  5.36
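
The test regression shown in the Call line above can be reproduced with lm() alone. The following is a minimal sketch on a simulated random walk, not on the construction data: the series y, the choice of p = 2 lagged differences, and the way the trend index tt is numbered are assumptions made for illustration, and the 5% critical value of -3.42 is simply read from the tau3 row above rather than computed.

```r
set.seed(123)
y <- cumsum(rnorm(300))     # simulated random walk (illustrative only)
p <- 2                      # number of lagged differences (assumed)

dy <- diff(y)
n  <- length(dy)

z.diff  <- dy[(p + 1):n]    # y[t] - y[t-1]
z.lag.1 <- y[(p + 1):n]     # level y[t-1]
tt      <- seq_along(z.diff)  # linear time trend
# Columns are the differences lagged by 1, ..., p periods
z.diff.lag <- sapply(1:p, function(k) dy[(p + 1 - k):(n - k)])

fit <- lm(z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

# The ADF statistic is the t value on the lagged level
tau3 <- coef(summary(fit))["z.lag.1", "t value"]

# Reject the unit-root null only if tau3 is more negative than the critical value
reject_5pct <- tau3 < -3.42
```

The statistic must be compared with Dickey-Fuller critical values such as those printed by ur.df, not with the usual t distribution, which is why the p-value that lm reports for z.lag.1 is not used.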

The series is restricted to October 1993 through September 2016 (276 observations), and construction spending is compared with its natural log.

con<-con[10:285]
con<-ts(con, start=c(1993,10),freq=12)
con
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1993                                                                 45361
## 1994  34917  33222  38215  41754  45711  49032  49545  51516  51718  49135
## 1995  37206  35250  40016  42824  46615  49804  50219  52628  53028  51372
## 1996  39356  37391  42052  46865  51598  54703  55430  57488  58544  57456
## 1997  41882  40788  45704  48998  53617  56977  58826  60682  61323  60053
## 1998  44409  43150  49602  54017  57723  64393  64381  66137  66544  64600
## 1999  48746  48824  55607  58654  62209  67545  68522  69975  70034  68669
## 2000  53782  53993  61295  64524  69253  72715  71774  76132  75153  73550
## 2001  56855  55529  62808  67787  72920  78004  78281  79916  76605  76302
## 2002  59516  58588  63782  69504  73384  77182  78863  79460  76542  75710
## 2003  59877  58526  64506  69638  74473  80377  82971  85191  83841  83133
## 2004  64934  64138  73238  78354  83736  89932  93614  96164  92538  90582
## 2005  72458  73094  81791  88032  93704 100678 103875 107453 104682 104039
## 2006  82400  82381  92354  97056 101862 106777 107150 108598 103102 100721
## 2007  79009  78501  87421  93644  99690 105020 106779 109573 104744 104313
## 2008  78039  77921  83384  89092  93316  97882 100234 100366  97303  96609
## 2009  67301  67030  71985  75043  76661  82089  83885  84426  82037  79952
## 2010  55362  54986  60990  66565  68903  74806  73918  76554  75818  73386
## 2011  50973  51017  57148  61590  65430  72495  72253  76986  75871  73783
## 2012  56608  56994  61803  67002  72228  77580  78305  81152  79404  80287
## 2013  58821  58898  64190  70601  75775  80997  84346  86776  85825  86551
## 2014  67391  66916  73899  80749  85599  90712  92585  93261  93193  94888
## 2015  71447  71159  79751  88308  94756 102716 104958 106856 106868 103844
## 2016  78610  79903  88984  92208  97620 105130 106633 109097 107702       
##         Nov    Dec
## 1993  44786  39535
## 1994  46719  40406
## 1995  48163  41542
## 1996  53498  45313
## 1997  54968  48031
## 1998  60469  53095
## 1999  66684  59082
## 2000  69677  60910
## 2001  71543  63697
## 2002  71362  63984
## 2003  77915  71050
## 2004  86394  77733
## 2005  98348  88657
## 2006  93850  85031
## 2007  94934  84325
## 2008  86093  77112
## 2009  71527  64608
## 2010  67318  60648
## 2011  68203  62582
## 2012  73071  66022
## 2013  79695  73876
## 2014  86262  80174
## 2015  94876  86894
## 2016
plot(con,main= "Total Construction Spending")

lncon<-log(con)
plot(lncon,main= "Log of Total Construction Spending")

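Model 1 first fits a quadratic time trend to the log series and is then refit with a linear trend only. The residual plots confirm the seasonality, and the Durbin-Watson statistic (0.17) indicates strong positive autocorrelation in the residuals of the quadratic model.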

t<-(1:length(lncon))
t2<-t^2
model1<-lm(lncon~t+t2)
summary(model1)
## 
## Call:
## lm(formula = lncon ~ t + t2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.50306 -0.11809  0.02093  0.11543  0.35039 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.061e+01  3.087e-02  343.70   <2e-16 ***
## t            7.215e-03  5.147e-04   14.02   <2e-16 ***
## t2          -1.780e-05  1.799e-06   -9.89   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1697 on 273 degrees of freedom
## Multiple R-squared:  0.6034, Adjusted R-squared:  0.6005 
## F-statistic: 207.7 on 2 and 273 DF,  p-value: < 2.2e-16
dwtest(model1)
## 
##  Durbin-Watson test
## 
## data:  model1
## DW = 0.17321, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
plot(lncon, ylim=c(7.8,12),main="Total Construction Spending")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.35,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.3,0.3)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)
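
The Durbin-Watson statistic reported by dwtest above can be computed by hand as the sum of squared differences of successive residuals divided by the sum of squared residuals, which is roughly 2 * (1 - r1), where r1 is the lag-1 residual autocorrelation. The sketch below uses a simulated series with a quadratic trend and AR(1) errors; every coefficient and name in it is an assumption chosen for illustration, not an estimate from the construction data.

```r
set.seed(7)
n   <- 276
idx <- 1:n
# Simulated AR(1) errors with strong positive autocorrelation (illustrative)
e <- as.numeric(arima.sim(list(ar = 0.9), n = n, sd = 0.05))
y <- 10.6 + 0.007 * idx - 0.00002 * idx^2 + e

fit <- lm(y ~ idx + I(idx^2))   # quadratic trend, as in model1
r   <- resid(fit)

# Durbin-Watson statistic computed directly from the residuals
dw <- sum(diff(r)^2) / sum(r^2)

# Lag-1 autocorrelation of the residuals and the implied approximation
r1 <- acf(r, plot = FALSE)$acf[2]
c(dw = dw, approx = 2 * (1 - r1))
```

Values near 2 indicate no first-order autocorrelation, while values near 0, as in the dwtest output above, indicate strong positive autocorrelation.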

model1<-lm(lncon~t)

plot(lncon, ylim=c(8.5,12),xlim=c(2010,2016),main="Trend")
trend<-fitted(model1)
trend<-ts(trend, frequency=12, start=c(1993,10))
lines(trend,col="blue",lwd=1.5)
par(new=T)
residuals<-lncon-trend
plot(residuals,ylim=c(-0.5,1.5),ylab='',axes=F,main="Total Construction Spending")
axis(4, pretty(c(-0.4,0.4)))
par(mar=c(5.1, 4.1, 4.1, 2.1) + 1.2)
abline(h=0,col='grey')
mtext("Residuals",side=4,line=2,at=0)

par(mfrow=c(2,1), mar=c(3,5,3,3))
acf(residuals)
pacf(residuals)