Forecasting with {modeltime} - Part IV

Tuning many global models

Time Series
Forecasting
Machine Learning
Author

Arcenis Rojas

Published

February 20, 2024

Review

Up to this point in the project I acquired data in Forecasting with {modeltime} - Part I, performed some standard time series analysis in Forecasting with {modeltime} - Part II, and explored the two types of modeling processes for multiple time series, global and iterative, in Forecasting with {modeltime} - Part III. In this post I’m going to tune multiple models using a global process to select a small number of them to run iteratively in the final post. I’ll incorporate some of the other variables in the data set that I gathered in the first post to forecast the Case-Shiller Home Price Index (HPI). Since I’ll be creating variations of model specifications, I’m also going to use functions from the Tidymodels workflowsets (Kuhn and Couch 2023) package to run the models. As before, I’ll be using modeltime (Dancho 2023), timetk (Dancho and Vaughan 2023), tidymodels (Kuhn and Wickham 2020), tidyverse (Wickham et al. 2019), and gt (Iannone et al. 2024). Before jumping in, below is a glimpse of the data that I’ll be using.

Code
library(tidyverse)
library(timetk)
library(modeltime)
library(tidymodels)
library(gt)
conflicted::conflict_prefer("filter", "dplyr")
conflicted::conflict_prefer("lag", "dplyr")
tidymodels_prefer()
Code
skimr::skim(econ_data)
Data summary
Name econ_data
Number of rows 816
Number of columns 16
_______________________
Column type frequency:
character 1
Date 1
numeric 14
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
city 0 1 6 9 0 4 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
date 0 1 2006-01-01 2022-12-01 2014-06-16 204

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
hpi 0 1 172.19 64.38 64.47 122.40 168.06 203.87 427.80 ▆▇▃▁▁
unemp_rate 0 1 6.50 2.93 2.80 4.20 5.60 8.30 23.90 ▇▃▁▁▁
age_lt_18 0 1 0.24 0.02 0.19 0.22 0.24 0.25 0.30 ▂▇▇▅▂
age_18_35 0 1 0.25 0.02 0.20 0.24 0.25 0.26 0.31 ▁▃▇▃▁
age_36_65 0 1 0.39 0.02 0.33 0.38 0.39 0.40 0.44 ▁▃▇▂▁
age_gt_65 0 1 0.12 0.02 0.07 0.11 0.12 0.14 0.19 ▃▆▇▅▁
educ_hs_less 0 1 0.33 0.03 0.24 0.31 0.33 0.35 0.40 ▁▃▇▇▃
educ_college 0 1 0.38 0.03 0.31 0.36 0.38 0.40 0.48 ▂▇▆▃▁
educ_grad 0 1 0.10 0.02 0.05 0.08 0.10 0.11 0.16 ▃▇▇▅▁
status_married 0 1 0.40 0.02 0.32 0.39 0.40 0.41 0.45 ▁▁▇▇▂
status_nev_mar 0 1 0.27 0.03 0.19 0.25 0.27 0.29 0.34 ▂▅▇▇▂
status_other 0 1 0.33 0.02 0.28 0.32 0.33 0.34 0.45 ▃▇▃▁▁
hispanic 0 1 0.22 0.11 0.02 0.17 0.25 0.30 0.42 ▅▁▆▇▂
population 0 1 8471150.57 6499655.47 2834025.25 4050215.00 5109791.12 10555276.77 20320876.00 ▇▃▁▁▃

Splitting the data

The data split object for a global modeling process differs from the one used for an iterative modeling process. For the global process I’ll use timetk::time_series_split(). Below is a look at how the data are split.

Code
econ_splits <- econ_data |>
  time_series_split(
    assess = "2 year",
    cumulative = TRUE,
    date_var = date
  )

econ_splits
<Analysis/Assess/Total>
<720/96/816>
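For contrast, the iterative (nested) process from Part III splits each city’s series separately. Below is a rough sketch using modeltime’s nested API; the 24-month test window mirrors the 2-year assessment period above, and the 12-month future extension is an assumption for illustration.

Code
econ_nested_splits <- econ_data |>
  # add empty future rows to each city's series
  extend_timeseries(.id_var = city, .date_var = date, .length_future = 12) |>
  # nest each city's data into its own row
  nest_timeseries(.id_var = city, .length_future = 12) |>
  # create a train/test split inside each nested series
  split_nested_timeseries(.length_test = 24)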

Choosing algorithms

As with selecting the cities for the analysis, I’m interested in getting a little variation in the algorithms to see how they perform, and I also want to vary their parameters a bit. To get something of a traditional flavor I’m going to run ARIMA models with three different model formulas:

1) hpi ~ date

2) hpi ~ date + unemployment

3) hpi ~ date + unemployment + demographic variables

I’ll also have five specifications for the algorithm: the default auto-ARIMA plus four auto-ARIMA specifications with different season lengths (9, 12, 15, and 18 months). This will give me fifteen different ARIMA models (3 recipes x 5 algorithm specifications).

In much the same way I will have different sets of recipes and model specifications for the following:

  • an exponential smoothing algorithm (including a Holt-Winters specification) to allow for weighting the periods exponentially

  • an STLM algorithm using an ARIMA engine (STLM: Seasonal Trend Decomposition using Loess with Multiple seasonal periods)

  • a feed-forward, auto-regressive neural-net algorithm, because these tend to have high predictive ability

Some of these algorithms allow for covariates and others (like exponential smoothing) do not. In the end I will have 90 model specifications to test. To speed up the processing I’ll employ modeltime’s parallel-processing feature.
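For bookkeeping, here is how those 90 specifications break down; the counts are a back-of-the-envelope check against the grids defined in the “Specifying models” section below.

Code
# Tally of specifications per algorithm family:
#   ARIMA: 3 recipes x (1 default + 4 seasonal periods)
#   ETS:   1 recipe  x (1 default + 4 alphas x 2 gammas x 4 periods)
#   STLM:  1 recipe  x 5 second seasonal periods
#   NNET:  1 recipe  x (1 default + 3 x 3 x 2 x 2 grid combinations)
sum(c(arima = 3 * (1 + 4), ets = 1 + 32, stlm = 5, nnet = 1 + 36))
[1] 90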

Building workflow objects

Here I’ll write the three recipes for the ARIMA models and one for the auto-regressive neural-net model. The exponential smoothing and STLM models will both use the first of the three ARIMA recipes as neither of these algorithms can handle covariates.

You might notice in the code below that I exclude some of the demographic variables, like “educ_hs_less”, from the third ARIMA recipe. I exclude them because I found that including every variable in a demographic group causes the functions to error out, much as including all of the categories of a factor variable would. Initially I did not expect to need contrasts, given that the values are ratios, but the shares within each variable group, e.g., education, sum to 1 for every row, so any one share is a linear combination of the intercept and the other shares. That creates the same need for contrasts as the usual dummy-variable trap (this is my guess).
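Below is a minimal sketch of that guess with simulated (hypothetical) data: when the shares in a group sum to 1 for every row, lm() aliases one of them.

Code
set.seed(1)
shares <- matrix(runif(300), ncol = 3)
shares <- shares / rowSums(shares)  # force each row of shares to sum to 1

demo_df <- data.frame(
  y  = rnorm(100),
  s1 = shares[, 1],
  s2 = shares[, 2],
  s3 = shares[, 3]
)

# the last share is perfectly collinear with the rest, so its coefficient is NA
coef(lm(y ~ s1 + s2 + s3, data = demo_df))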

Writing recipes

Code Explanations

Some of the code chunks below are annotated. The numbered notes that follow each chunk explain the corresponding steps in order.

Code
arima_rec1 <- recipe(hpi ~ date, data = training(econ_splits))

arima_rec2 <- recipe(
  hpi ~ date + unemp_rate,
  data = training(econ_splits)
) |>
  step_center(all_numeric_predictors()) |>
  step_scale(all_numeric_predictors()) |>
  step_lag(unemp_rate, lag = c(1, 3, 6))

rec3_dep_vars <- econ_data |>
  select(date, unemp_rate:population) |>
  select(!c(educ_hs_less, status_married, age_36_65)) |>
  names() |>
  str_flatten(collapse = " + ")

arima_rec3 <- recipe(
  formula(str_c("hpi", rec3_dep_vars, sep = " ~ ")),
  data = training(econ_splits)
) |>
  step_center(all_numeric_predictors()) |>
  step_scale(all_numeric_predictors()) |>
  step_lag(unemp_rate, population, lag = c(1, 3, 6))

nnet_rec <- recipe(
  hpi ~ .,
  data = training(econ_splits)
) |>
  update_role(city, new_role = "ID") |>
  step_center(all_numeric_predictors()) |>
  step_scale(all_numeric_predictors()) |>
  step_timeseries_signature(date) |>
  step_nzv(all_predictors())
1) Write a base recipe that regresses HPI on the date variable.

2) Write a recipe formula that adds the unemployment rate as an explanatory variable.

3) Mean-center all numeric predictors (only the unemployment rate here).

4) Scale all numeric predictors to a standard deviation of 1; combined with centering, each predictor ends up with a mean of 0 and a standard deviation of 1.

5) Create lag variables for the unemployment rate and population.

6) Get the names of the date variable and the economic and demographic variables from the data set and concatenate them into a formula string forming the right side of a regression formula, i.e., “var1 + var2 + var3…”.

7) Because the “outcome ~ .” notation is being used, the “city” variable would be treated as an explanatory variable. This step changes its role from “predictor” to an ID variable, which modeling workflows exclude from the set of predictors.

8) Add a set of variables derived from the date, such as day of the week, day of the month, week of the year, etc.

9) Remove any variables with zero or near-zero variance. An example would be a variable indicating the day of the month: since this is monthly data, all values would be “1”, i.e., the first day of the month.

Here is a reminder of what a recipe looks like. This is the neural-net recipe.

Code
nnet_rec

And this is the data set that gets created by that recipe.

Code
nnet_rec |> prep() |> juice() |> slice(1:5) |> gt() |> gt_bold_head()
city date unemp_rate age_lt_18 age_18_35 age_36_65 age_gt_65 educ_hs_less educ_college educ_grad status_married status_nev_mar status_other hispanic population hpi date_index.num date_year date_year.iso date_half date_quarter date_month date_month.xts date_month.lbl date_wday date_wday.xts date_wday.lbl date_qday date_yday date_mweek date_week date_week.iso date_week2 date_week3 date_week4
Dallas 2006-01-01 -0.5318986 1.9837898 1.2244300 -2.1179446 -1.4144536 1.4132488 -1.9352723 -1.0007519 0.7138177 -1.8445270 1.9655339 0.2421831 -0.4128166 121.9108 1136073600 2006 2005 1 1 1 0 January 1 0 Sunday 1 1 5 1 52 1 1 1
San Diego 2006-01-01 -0.8651452 -0.2639636 -0.5419655 -0.1248023 0.7104276 -0.6271533 1.1018075 -0.5165046 -0.8376538 -0.5633246 1.4600357 0.4759810 -0.8614678 247.4588 1136073600 2006 2005 1 1 1 0 January 1 0 Sunday 1 1 5 1 52 1 1 1
New York 2006-01-01 -0.5652232 0.4335013 -0.7101494 -0.3141128 0.2439870 1.6923074 -1.9655377 0.3044405 -0.5241964 -0.2666740 0.7951381 -0.1053321 1.5314733 213.4958 1136073600 2006 2005 1 1 1 0 January 1 0 Sunday 1 1 5 1 52 1 1 1
Detroit 2006-01-01 0.1679193 1.1233044 -1.7256753 -0.3452176 0.4389330 1.7816712 -1.6797175 -1.2833046 -0.7472264 -1.1655961 2.2179937 -1.6335110 -0.6156132 126.6627 1136073600 2006 2005 1 1 1 0 January 1 0 Sunday 1 1 5 1 52 1 1 1
Dallas 2006-02-01 -0.4985739 2.4431155 1.3332951 -2.2373194 -1.7310508 0.7705532 -0.9746432 -1.6783801 1.0008565 -1.5342766 1.3033739 0.4153678 -0.4092439 121.3285 1138752000 2006 2006 1 1 2 1 February 4 3 Wednesday 32 32 5 5 5 1 2 1

Specifying models

In this next step I’ll specify the model objects for each algorithm along with the search grids for their various hyperparameters. I chose to keep only some of the combinations of the exponential smoothing specifications to limit the number of models being run that are likely to fail.

Code
arima_default <- arima_reg() |> set_engine("auto_arima")

arima_spec_grid <- tibble(seasonal_period = seq(9, 18, by = 3)) |>
  create_model_grid(
    f_model_spec = arima_reg,
    engine_name = "auto_arima",
    mode = "regression"
  )

ets_default <- exp_smoothing() |> set_engine("ets")

set.seed(40)
alpha_vals <- runif(5, 0.01, 0.5) *
  sample(c(1, 0.1, 0.01, .001), 5, replace = TRUE)
beta_vals <- runif(5, 0.01, 0.2) * sample(c(1, 0.1, 0.01), 5, replace = TRUE)
gamma_vals <- map_dbl(
  alpha_vals,
  \(x) runif(1, 0, 1 - x)
)

ets_spec_grid <- expand_grid(
  smooth_level = alpha_vals[-2],
  smooth_seasonal = gamma_vals[2:3],
  seasonal_period = seq(9, 18, by = 3)
) |>
  create_model_grid(
    f_model_spec = exp_smoothing,
    engine_name = "ets",
    mode = "regression",
    error = "auto",
    trend = "auto",
    season = "auto",
    smooth_trend = beta_vals[3]
  )

stlm_spec_grid <- expand_grid(
  # note: c() silently drops NULL, so this yields five values: 6, 9, 12, 15, 18
  seasonal_period_2 = c(NULL, seq(6, 18, by = 3))
) |>
  create_model_grid(
    f_model_spec = seasonal_reg,
    engine_name = "stlm_arima",
    mode = "regression",
    seasonal_period_1 = 12
  )

nnet_default <- nnetar_reg() |> set_engine("nnetar")

nnet_spec_grid <- expand_grid(
  non_seasonal_ar = 1:3,
  hidden_units = 3:5,
  num_networks = c(20, 25),
  penalty = c(0.2, 0.25)
) |>
  create_model_grid(
    f_model_spec = nnetar_reg,
    engine_name = "nnetar",
    mode = "regression",
    seasonal_period = 12,
    epochs = 100,
    seasonal_ar = 1
  )
1) Specify an ARIMA regression model using the “auto_arima” engine with all default arguments.

2) Create a 1-column tibble (data frame) with the seasonal_period argument varying over 9, 12, 15, and 18 (months).

3) create_model_grid() allows one to set static arguments and combine them with the arguments that vary; it serves a purpose similar to tune::tune_grid() from Tidymodels.

4) Set a random seed for reproducibility and generate vectors for the smooth_level, smooth_trend, and smooth_seasonal arguments; in forecast::ets() these are alpha, beta, and gamma, respectively.

5) Use tidyr::expand_grid() to get all combinations of the included vectors.

Below is what a model grid looks like.

Code
ets_spec_grid |> slice(1:5)
# A tibble: 5 × 4
  smooth_level smooth_seasonal seasonal_period .models  
         <dbl>           <dbl>           <dbl> <list>   
1       0.0345           0.569               9 <spec[+]>
2       0.0345           0.569              12 <spec[+]>
3       0.0345           0.569              15 <spec[+]>
4       0.0345           0.569              18 <spec[+]>
5       0.0345           0.763               9 <spec[+]>

This is a glimpse of a model specification.

Code
ets_spec_grid |> slice(1) |> pluck(".models", 1)
Exponential Smoothing State Space Model Specification (regression)

Main Arguments:
  seasonal_period = 9
  error = auto
  trend = auto
  season = auto
  smooth_level = 0.0344955188313033
  smooth_trend = 0.0247416579537094
  smooth_seasonal = 0.569256213388705

Computational engine: ets 

Building workflowsets

A workflowset is, you guessed it, a set of workflows. Mechanically it is a very convenient wrapper that combines different recipes (pre-processors) and models. I don’t want to cross all of my recipes with all of my models, however, so in this step I’ll build individual workflow sets that pair the appropriate recipes and models and then bind them all together at the end.

Code
arima_wfset <- workflow_set(
  preproc = list(
    base_rec = arima_rec1,
    econ_rec = arima_rec2,
    demog_rec = arima_rec3
  ),
  models = c(
    arima_default = list(arima_default),
    arima_spec = arima_spec_grid$.models
  ),
  cross = TRUE
)

ets_wfset <- workflow_set(
  preproc = list(base_rec = arima_rec1),
  models = c(ets_default = list(ets_default), ets_spec = ets_spec_grid$.models),
  cross = TRUE
)

stlm_wfset <- workflow_set(
  preproc = list(base_rec = arima_rec1),
  models = c(stlm_spec = stlm_spec_grid$.models),
  cross = TRUE
)

nnet_wfset <- workflow_set(
  preproc = list(nnet_rec = nnet_rec),
  models = c(
    nnet_default = list(nnet_default),
    nnet_spec = nnet_spec_grid$.models
  ),
  cross = TRUE
)
1) Call workflowsets::workflow_set().

2) Provide a named list of recipes to use in the workflow set; naming the elements at this stage makes it easier to match the rows of the final table of trained models back to the original recipes and model specifications.

3) Provide a named list of model specifications by pulling the “.models” column from each grid.

4) Indicate that all recipes should be combined with all model specifications.

The individual specifications in the final model table will be named using a combination of the name of the recipe and the name of the model specification given in the respective lists with a count added whenever there are duplicate names. For example, arima_spec_grid has four model specifications in it. The ones that are combined with the first ARIMA recipe (named “base_rec”) will be named BASE_REC_ARIMA_SPEC1, BASE_REC_ARIMA_SPEC2, BASE_REC_ARIMA_SPEC3, and BASE_REC_ARIMA_SPEC4.
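If I ever need to confirm which recipe and model a given id maps to, I can pull the workflow back out of the set by its wflow_id (the lowercase form shown in the table a bit further down):

Code
# extract a single workflow from the set to inspect its recipe and model spec
arima_wfset |> extract_workflow("base_rec_arima_spec1")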

With all of the model-specific workflow sets created, now it’s time to put them all together.

Code
hpi_wfset <- bind_rows(arima_wfset, ets_wfset, stlm_wfset, nnet_wfset)

hpi_wfset |> slice(1:5)
# A workflow set/tibble: 5 × 4
  wflow_id               info             option    result    
  <chr>                  <list>           <list>    <list>    
1 base_rec_arima_default <tibble [1 × 4]> <opts[0]> <list [0]>
2 base_rec_arima_spec1   <tibble [1 × 4]> <opts[0]> <list [0]>
3 base_rec_arima_spec2   <tibble [1 × 4]> <opts[0]> <list [0]>
4 base_rec_arima_spec3   <tibble [1 × 4]> <opts[0]> <list [0]>
5 base_rec_arima_spec4   <tibble [1 × 4]> <opts[0]> <list [0]>

That’s all the heavy lifting. Now it’s time to let the computer processors do their part and run the models.

Fitting the models

Code
avail_cores <- unname(parallelly::availableCores() - 1)
modulo_vec <- sapply(avail_cores:2, \(x) nrow(hpi_wfset) %% x)
           
num_cores <- max(
  max(which(modulo_vec == max(modulo_vec))),
    max(which(modulo_vec == 0))
) + 1

cl <- parallel::makePSOCKcluster(num_cores)
doParallel::registerDoParallel(cl)
invisible(
  parallel::clusterCall(cl, function(x) .libPaths(x), .libPaths())
)
invisible(parallel::clusterEvalQ(cl, set.seed(500)))

hpi_mods_start <- proc.time()

hpi_mods <- hpi_wfset |>
  modeltime_fit_workflowset(
    data = training(econ_splits),
    control = control_fit_workflowset(verbose = FALSE, allow_par = TRUE)
  )

hpi_mods_end <- proc.time()

parallel::stopCluster(cl)
rm(cl)
1) Get the number of available cores, leaving one free for common tasks like running the operating system.

2) Calculate the number of cores to use in order to maximize efficiency.

3) Set up the cluster for parallel processing; I copied this code from modeltime::parallel_start().

4) Set the same random seed on each node.

5) Call the workflowset object containing all the workflows to run.

6) Use modeltime::modeltime_fit_workflowset() to fit all of the models.

7) Use the training portion of the split object to train the models.

8) Set some controls for the model-fitting function.

9) Stop the parallel cluster.
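As an aside, modeltime wraps most of this cluster housekeeping in a pair of helper functions. A simpler, though less configurable, equivalent of the setup and teardown above would be roughly:

Code
parallel_start(num_cores)  # wraps cluster creation and registration
# ...run modeltime_fit_workflowset() with allow_par = TRUE as above...
parallel_stop()            # stops and unregisters the cluster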
Code
hpi_mods |> as_tibble()
# A tibble: 90 × 3
   .model_id .model     .model_desc           
       <int> <list>     <chr>                 
 1         1 <workflow> BASE_REC_ARIMA_DEFAULT
 2         2 <workflow> BASE_REC_ARIMA_SPEC1  
 3         3 <workflow> BASE_REC_ARIMA_SPEC2  
 4         4 <workflow> BASE_REC_ARIMA_SPEC3  
 5         5 <workflow> BASE_REC_ARIMA_SPEC4  
 6         6 <workflow> ECON_REC_ARIMA_DEFAULT
 7         7 <workflow> ECON_REC_ARIMA_SPEC1  
 8         8 <workflow> ECON_REC_ARIMA_SPEC2  
 9         9 <workflow> ECON_REC_ARIMA_SPEC3  
10        10 <workflow> ECON_REC_ARIMA_SPEC4  
# ℹ 80 more rows
Code
fit_time <- (hpi_mods_end - hpi_mods_start) |>
  enframe("time_metric", "seconds") |>
  drop_na(seconds)

fit_time |> gt() |> gt_bold_head()
time_metric seconds
user.self 1.48
sys.self 0.26
elapsed 83.06

About 83 seconds elapsed during model fitting with parallel processing. Not too bad. Unfortunately, though, some of the models failed (I suppressed the error printing), so before calibrating the models I’m going to drop those from the model output data set. But first I’ll investigate the model failures a little.

Investigating failed models

Here’s a look at some of the models that failed.

Code
mod_fail_df <- hpi_mods |>
  mutate(mod_fail = map_lgl(.model, \(x) is.null(x))) |>
  select(!.model)

mod_fail_df |>
  group_by(mod_fail) |>
  slice_min(.model_id, n = 3) |>
  gt(groupname_col = NULL) |>
  gt_bold_head()
.model_id .model_desc mod_fail
1 BASE_REC_ARIMA_DEFAULT FALSE
2 BASE_REC_ARIMA_SPEC1 FALSE
3 BASE_REC_ARIMA_SPEC2 FALSE
33 BASE_REC_ETS_SPEC17 TRUE
34 BASE_REC_ETS_SPEC18 TRUE
35 BASE_REC_ETS_SPEC19 TRUE

Now I want to see the distinct names of the recipes and algorithms (model specifications) that failed.

Code
failure_spec_df <- mod_fail_df |>
  filter(mod_fail) |>
  separate_wider_delim(
    .model_desc,
    delim = "_",
    names = c("rec_name", "rec", "algo", "spec")
  ) |>
  separate_wider_regex(spec, patterns = c(spec = "SPEC", spec_num = "\\d+"))

failure_spec_df |> distinct(rec_name, algo, spec) |> gt() |> gt_bold_head()
rec_name algo spec
BASE ETS SPEC

Since the failures occurred with only one algorithm they’ll be easy to isolate. Now I want to get the numbers of the specifications that failed so that I can pull up the details on those specifications.

Code
failures <- failure_spec_df |>
  pull(spec_num) |>
  as.numeric()

ets_failed <- ets_spec_grid |>
  slice(failures) |>
  select(!.models)

ets_failed |>
  count(smooth_level) |>
  gt() |>
  gt_bold_head()
smooth_level n
6.680868e-05 8

Overall, 8 of the 32 exponential smoothing grid specifications failed, all of them sharing the same very small smooth_level. That’s not terrible. I’m going to remove these specifications from the trained-model data set and move on to model evaluation.
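My guess about the cause, and it is only a guess, is the smooth_level value itself: forecast::ets() bounds alpha to [1e-4, 1 - 1e-4] by default, and 6.68e-05 sits below that floor. A quick way to probe the hypothesis:

Code
# if the hypothesis holds, fixing alpha below ets()'s default lower bound
# of 1e-4 should fail the same way; try() keeps the script running regardless
try(forecast::ets(AirPassengers, alpha = 6.68e-05))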

Model evaluation

To evaluate the models I first need to calibrate them. I’ll remove the failed models, calibrate the successful ones, and store the accuracy tables here. Below are the 10 best-performing global models, i.e., the ones that performed best on the batched series with all four cities. I defined “best” as the models with the lowest SMAPE (Symmetric Mean Absolute Percentage Error) values.
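For reference, SMAPE divides each absolute error by the average magnitude of the actual and predicted values, so it is expressed in percent. Below is a minimal version of the calculation; yardstick::smape_vec() is the packaged equivalent.

Code
# minimal SMAPE: mean of 2|pred - actual| / (|actual| + |pred|), in percent
smape_manual <- function(actual, pred) {
  100 * mean(2 * abs(pred - actual) / (abs(actual) + abs(pred)))
}

smape_manual(actual = c(100, 110, 120), pred = c(102, 108, 125))
[1] 2.632231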

I broke the calibration code below into three separate chunks to keep the sizes of the cached files a bit smaller (I cached the output to make the page easier to render). Splitting up the modeltime table for calibration like this is not typically necessary.

Code
hpi_mods <- hpi_mods |> slice(which(!mod_fail_df$mod_fail))

num_mods <- nrow(hpi_mods)
hpi_mod_idx <- split(1:num_mods, ceiling(seq_along(1:num_mods) / (num_mods / 3)))
Code
hpi_mod_calib1 <- hpi_mods |>
  slice(hpi_mod_idx[[1]]) |>
  modeltime_calibrate(testing(econ_splits), id = "city", quiet = TRUE)
Code
hpi_mod_calib2 <- hpi_mods |>
  slice(hpi_mod_idx[[2]]) |>
  modeltime_calibrate(testing(econ_splits), id = "city", quiet = TRUE)
Code
hpi_mod_calib3 <- hpi_mods |>
  slice(hpi_mod_idx[[3]]) |>
  modeltime_calibrate(testing(econ_splits), id = "city", quiet = TRUE)
Code
hpi_mod_calib <- bind_rows(hpi_mod_calib1, hpi_mod_calib2, hpi_mod_calib3)
Code
hpi_mod_accuracy <- hpi_mod_calib |>
  modeltime_accuracy() |>
  drop_na(.type)

hpi_local_mod_accuracy <- hpi_mod_calib |>
  modeltime_accuracy(acc_by_id = TRUE) |>
  drop_na(.type)

best_global_table <- hpi_mod_accuracy |>
  slice_min(smape, n = 10)

best_local_table <- hpi_local_mod_accuracy |>
  group_by(city) |>
  slice_min(smape, n = 3)

best_global_table |>
  table_modeltime_accuracy(
    .interactive = FALSE,
    .title = "Global Accuracy"
  ) |>
  gt_bold_head() |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_title()
  )
Global Accuracy
.model_id .model_desc .type mae mape mase smape rmse rsq
16 BASE_REC_ETS_DEFAULT Test 19.29 6.22 0.18 6.59 28.08 0.96
1 BASE_REC_ARIMA_DEFAULT Test 20.79 7.14 0.19 7.37 29.50 0.95
3 BASE_REC_ARIMA_SPEC2 Test 20.79 7.14 0.19 7.37 29.50 0.95
30 BASE_REC_ETS_SPEC14 Test 20.81 7.48 0.19 7.60 28.89 0.95
22 BASE_REC_ETS_SPEC6 Test 20.80 7.48 0.19 7.60 28.87 0.95
46 BASE_REC_ETS_SPEC30 Test 22.16 7.53 0.21 7.82 31.18 0.95
42 BASE_REC_ETS_SPEC26 Test 22.31 7.77 0.21 8.00 31.01 0.95
53 BASE_REC_STLM_SPEC5 Test 30.98 9.75 0.29 10.59 43.56 0.94
49 BASE_REC_STLM_SPEC1 Test 31.20 9.81 0.29 10.68 43.87 0.93
52 BASE_REC_STLM_SPEC4 Test 32.10 10.02 0.30 10.94 45.27 0.93

Next is a table with the best 3 models calibrated to each city.

Code
best_local_table |>
  table_modeltime_accuracy(
    .interactive = FALSE,
    .title = "Accuracy by City"
  ) |>
  gt_bold_head() |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_title()
  ) |>
  tab_style(
    style = list(cell_text(weight = "bold"), cell_fill(color = "lightgray")),
    locations = cells_group()
  ) |>
  tab_style(
    style = cell_borders(
      sides = c("top", "bottom"),
      weight = px(2),
      style = ("solid")
    ),
    locations = cells_group()
  )
Accuracy by City
.model_id .model_desc .type mae mape mase smape rmse rsq
Dallas
12 DEMOG_REC_ARIMA_SPEC1 Test 13.99 5.18 2.65 5.32 15.79 0.80
14 DEMOG_REC_ARIMA_SPEC3 Test 14.19 5.24 2.69 5.39 15.97 0.80
22 BASE_REC_ETS_SPEC6 Test 20.59 7.34 3.93 7.72 24.76 0.85
Detroit
76 NNET_REC_NNET_SPEC22 Test 2.73 1.64 1.55 1.65 3.71 0.87
63 NNET_REC_NNET_SPEC9 Test 2.84 1.71 1.62 1.74 3.92 0.89
66 NNET_REC_NNET_SPEC12 Test 2.92 1.77 1.66 1.74 4.49 0.87
New York
1 BASE_REC_ARIMA_DEFAULT Test 3.09 1.16 1.30 1.16 4.41 0.94
3 BASE_REC_ARIMA_SPEC2 Test 3.09 1.16 1.30 1.16 4.41 0.94
30 BASE_REC_ETS_SPEC14 Test 3.18 1.19 1.34 1.17 5.06 0.95
San Diego
51 BASE_REC_STLM_SPEC3 Test 23.42 6.49 3.10 6.34 28.83 0.61
12 DEMOG_REC_ARIMA_SPEC1 Test 35.62 9.29 4.85 9.84 39.31 0.71
14 DEMOG_REC_ARIMA_SPEC3 Test 36.77 9.58 5.01 10.16 40.32 0.71

There’s a pretty good mix of models across the two tables, including ARIMA, ETS (lots of this one), STLM, and neural-net. Among the ARIMA models, some used the basic recipe and some used the recipe that included both the economic and demographic variables.

Below are plots of forecasts of the test data period for each city using the best 3 global models.

Code
top_3_global <- best_global_table |> slice_min(smape, n = 3) |> pull(.model_id)

hpi_mod_calib |>
  filter(.model_id %in% top_3_global) |>                            
  modeltime_forecast(                            
    new_data = testing(econ_splits),      
    actual_data = econ_data,                     
    conf_by_id = TRUE,                           
    keep_data = TRUE                             
  ) |>
  group_by(city) |>                             
  plot_modeltime_forecast(                      
    .interactive = FALSE,
    .title = "Forecast of Test Data",
    .facet_ncol = 1
  ) +
  theme(plot.title = element_text(hjust = 0.5))
Figure 1: Forecast of the test-data period for each city using the three best global models

Just for fun, here are the 5 worst-performing models (global).

Code
hpi_mod_accuracy |>
  slice_max(smape, n = 5, with_ties = FALSE) |>
  gt() |>
  gt_bold_head()
.model_id .model_desc .type mae mape mase smape rmse rsq
2 BASE_REC_ARIMA_SPEC1 Test 82.07379 28.40303 0.7639937 32.64595 100.22516 0.0237270639
4 BASE_REC_ARIMA_SPEC3 Test 81.85927 28.93424 0.7619969 32.63484 99.20593 0.0005985376
17 BASE_REC_ETS_SPEC1 Test 81.34845 28.53501 0.7572419 32.31949 99.36593 NA
19 BASE_REC_ETS_SPEC3 Test 81.34845 28.53501 0.7572419 32.31949 99.36593 NA
20 BASE_REC_ETS_SPEC4 Test 81.34845 28.53501 0.7572419 32.31949 99.36593 NA

Forecasting with global models

Below are forecasts for each city using the top 3 global models.

Code
hpi_refit <- hpi_mod_calib |>
  filter(.model_id %in% top_3_global) |>
  modeltime_refit(data = econ_data) 

hpi_future <- econ_data |>
  group_by(city) |>
  future_frame(.length_out = 12, .bind_data = FALSE, .date_var = date)

hpi_future_forecast <- hpi_refit |>
  modeltime_forecast(
    new_data = hpi_future,
    actual_data = econ_data,
    conf_by_id = TRUE
  )

hpi_future_forecast |>
  group_by(city) |>
  plot_modeltime_forecast(
    .interactive = FALSE,
    .title = "1 Year Forecast into the Future - Global",
    .facet_ncol = 1
  ) +
  theme(plot.title = element_text(hjust = 0.5))

Choosing models for iterative process

For iterative modeling I’m going to use the 5 best global models and the 3 best local models for each city (there’s some overlap between these lists). The specifications that I’m going to use for iterative modeling in the next post are:

Code
best_global_table |>
  slice_min(smape, n = 5) |>
  bind_rows(best_local_table) |>
  distinct(.model_desc) |>
  gt() |>
  gt_bold_head()
.model_desc
BASE_REC_ETS_DEFAULT
BASE_REC_ARIMA_DEFAULT
BASE_REC_ARIMA_SPEC2
BASE_REC_ETS_SPEC14
BASE_REC_ETS_SPEC6
DEMOG_REC_ARIMA_SPEC1
DEMOG_REC_ARIMA_SPEC3
NNET_REC_NNET_SPEC22
NNET_REC_NNET_SPEC9
NNET_REC_NNET_SPEC12
BASE_REC_STLM_SPEC3

References

Dancho, Matt. 2023. “Modeltime: The Tidymodels Extension for Time Series Modeling.” https://CRAN.R-project.org/package=modeltime.
Dancho, Matt, and Davis Vaughan. 2023. “Timetk: A Tool Kit for Working with Time Series.” https://CRAN.R-project.org/package=timetk.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2024. “Gt: Easily Create Presentation-Ready Display Tables.” https://CRAN.R-project.org/package=gt.
Kuhn, Max, and Simon Couch. 2023. “Workflowsets: Create a Collection of ’Tidymodels’ Workflows.” https://CRAN.R-project.org/package=workflowsets.
Kuhn, Max, and Hadley Wickham. 2020. “Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles.” https://www.tidymodels.org.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.