Time Series Forecasting
Case Study
This case study centers on Airbnb's earnings performance, which recently fell short of expectations and triggered a significant stock market decline. The company's challenges are most pronounced in the United States, and in the state of New York in particular. My primary objective was to develop a demand forecasting model for the upcoming month.
The Project
The project is structured into different sections to facilitate the development of forecasting models and the creation of ensemble forecasts. Each section has a specific focus, and they work together to achieve the project's goals.​
Exploratory Data Analysis
In the initial project phase, the exploration of time series analysis includes taking a closer look at the dataset and breaking it down into its seasonal components to better understand the underlying patterns. Here's a breakdown of what we're doing:
​
-
Setting Up the Environment:
Firstly, we import the necessary libraries for data manipulation, statistical analysis, and visualization.
​
​
​​
​
​​
-
Loading the Dataset
The dataset is stored in a ​CSV file which we load using 'pandas'. We ensure that the 'Date' column is parsed as a datetime object and set as the index. ​​​​​​​​
​
-
​​​Renaming the columns
For clarity, and to follow the common forecasting convention of naming the target variable 'y', we rename the 'Demand' column to 'y'.
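As a minimal sketch of the loading and renaming steps, the snippet below reads a CSV with pandas, parses 'Date' as a datetime index, and renames 'Demand' to 'y'. The inline CSV text and its values are made up for illustration; in the project the data comes from a file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for the project's CSV file (values are made up).
csv_text = """Date,Demand,Temperature,Marketing
2021-01-01,720.0,3.7,41.3
2021-01-02,581.3,4.7,131.6
2021-01-03,754.1,7.4,119.0
"""

# Parse 'Date' as datetime and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Rename the target column to 'y'.
df = df.rename(columns={"Demand": "y"})
print(df.head())
```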
​
​
​
​
​​
-
Decomposition:
First, we perform seasonal decomposition on the time series data. This involves breaking the data into its trend, seasonal, and residual components using a multiplicative model with a seasonal period of 365 days. We then visualize these components to gain insights into the data's underlying structure.
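To make the idea concrete, here is a hand-rolled sketch of a classical multiplicative decomposition using only pandas and NumPy. It is not the library routine used in the project: the data are synthetic, and the period is set to 7 instead of 365 purely to keep the example small.

```python
import numpy as np
import pandas as pd

period = 7  # the case study uses 365; 7 keeps this toy example short
idx = pd.date_range("2021-01-01", periods=8 * period, freq="D")
t = np.arange(len(idx))

# Synthetic series: linear trend times a repeating weekly pattern (made-up data).
pattern = np.array([0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.0])
y = pd.Series((100 + 0.5 * t) * pattern[t % period], index=idx)

# 1) Trend: centered moving average over one full period.
trend = y.rolling(window=period, center=True).mean()

# 2) Seasonal: average detrended ratio for each position in the cycle,
#    normalized so the seasonal factors average to 1.
ratio = y / trend
position = pd.Series(t % period, index=idx)
factors = ratio.groupby(position).mean()
factors = factors / factors.mean()
seasonal = position.map(factors)

# 3) Residual: what is left after dividing out trend and seasonality.
resid = y / (trend * seasonal)
print(resid.dropna().round(3).describe())
```

In a multiplicative model the three components multiply back to the original series, so residuals close to 1 mean the trend and seasonal factors explain most of the variation.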
​
​
​
​
​
We display the trend component of the decomposition. This shows the long-term trajectory of the data, helping us identify any overall patterns or trends.
​
We also visualize the residual component of the decomposition. This represents the unexplained variability in the data after removing the trend and seasonal components. Analyzing the residuals can help us identify any irregularities or outliers in the data.
​​
-
Seasonal Graphs:
Next, we plot seasonal graphs based on different time frames: monthly and quarterly. To do this, we transform the daily data into mean monthly data and mean quarterly data. This allows us to visualize how the data exhibits seasonal patterns on different time scales.
-
Aggregating Data:
To further understand the seasonal patterns, we aggregate the data based on a monthly time frame. This aggregation provides insights into the average values of the data for each month.
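A short sketch of these transformations, assuming a daily series with a datetime index. The demand values below are synthetic, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2020-01-01", "2021-12-31", freq="D")
day_of_year = idx.dayofyear.to_numpy()

# Made-up demand with a yearly cycle, for illustration only.
y = pd.Series(500 + 50 * np.sin(2 * np.pi * day_of_year / 365.25), index=idx, name="y")

monthly = y.resample("MS").mean()            # one mean per calendar month
quarterly = y.resample("QS").mean()          # one mean per calendar quarter
by_month = y.groupby(y.index.month).mean()   # average per month-of-year across years
print(by_month.round(1))
```

Resampling keeps the time axis (24 monthly points for two years), whereas grouping by month number collapses all years into 12 averages, which is what exposes the typical seasonal shape.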
​
​
​
Advanced Visualization
We delve into advanced data visualization techniques to gain deeper insights into the relationship between temperature, demand, and other factors in our dataset.​
-
Calculating Correlation:
We calculate the correlation between demand, temperature, and marketing factors to assess their relationships. This correlation analysis provides insights into how these variables are interrelated.
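As a small illustration, the pairwise correlation matrix can be computed with pandas. The data below are randomly generated, not the project's; only the column names follow the text.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(15, 8, n)
marketing = rng.normal(100, 20, n)

# Made-up demand that depends on both drivers plus noise.
y = 300 + 4 * temperature + 0.5 * marketing + rng.normal(0, 10, n)
df = pd.DataFrame({"y": y, "Temperature": temperature, "Marketing": marketing})

corr = df[["y", "Temperature", "Marketing"]].corr()  # Pearson correlation by default
print(corr.round(2))
```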
​
​
​
​
​
-
Autocorrelation Plot:
We create an autocorrelation function (ACF) plot to visualize the autocorrelation of the demand data. The plot helps us understand how each data point relates to previous data points in a time series.
​
​
​
​
Inference: The ACF plot shows a high bar at the first lag, indicating that the previous day's data is a strong predictor of today's values (the bar at lag 0 always equals 1 by definition, so it carries no information). As we look further to the right, at higher lags, the height of the bars decreases. This suggests that the more distant the past values are, the less impact they have on future values. However, these bars maintain a relatively significant height, implying that historical data continue to exert some influence over a prolonged period.
​
The gradual decrease in the bar heights, rather than a sharp drop-off, combined with the occurrence of periodic spikes, hints at a relationship between today's values and those from several days ago. This pattern could be indicative of a cyclical or seasonal trend, such as increased purchases on weekends or during holiday periods.
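To show what the bars in an ACF plot measure, the sketch below computes the sample autocorrelation by hand with NumPy on a synthetic series that has a 7-day cycle. The function name and the data are illustrative assumptions; plotting libraries compute the same quantity internally.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = np.sum(d * d)
    return np.array(
        [np.sum(d[k:] * d[: len(d) - k]) / denom for k in range(max_lag + 1)]
    )

t = np.arange(140)
y = 100 + 10 * np.sin(2 * np.pi * t / 7)  # made-up series with a 7-day cycle
acf = sample_acf(y, max_lag=14)
print(np.round(acf, 2))
```

On a series like this the values at lags 7 and 14 stand out against their neighbours, which is the kind of periodic spike described above.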
Facebook Prophet Forecast
In this section, we are using the Prophet library developed by Facebook to perform time series forecasting. Prophet is designed for forecasting time series data based on an additive model that fits non-linear trends with yearly, weekly, and daily seasonality, plus holiday effects.
-
Importing Libraries:
​
​
​​
-
​Renaming Columns:
For Prophet to understand our data, it requires the columns to be named as 'ds' for the date and 'y' for the value we want to forecast (in this case, 'Demand'). So, we rename our columns accordingly.​
​
​
​
​
-
​ Formatting the Date:
The date column 'ds' is transformed to a datetime object, ensuring Prophet can recognize and work with it. The describe() function gives a summary of the date column, telling us about its range and frequency.​
​
​
Incorporating Holidays into the Forecast
One of the unique features of Facebook's Prophet is its ability to incorporate the effects of holidays and special events into its forecasts. This can be particularly valuable when certain dates or periods have a consistent impact on the time series data. In this section, we’re accounting for two major holidays: Easter and Thanksgiving.
-
Accounting for Easter:
​
​
​
Here, we're creating a data frame for Easter that contains the dates when Easter occurs. The lower_window and upper_window parameters define a range around the holiday date.
​
-
Accounting for Thanksgiving:
Similarly, we create a data frame that contains the dates on which Thanksgiving occurs.
​​
​
​
​
​​​
-
Combining the Holiday Data:
Both holiday data frames (Easter and Thanksgiving) are combined into a single holidays data frame. This will be used later when creating the Prophet model to account for these special events.
​
-
​​Cleaning the Original Data:
Now that we've created separate data frames for the holidays, we no longer need the original 'Easter' and 'Thanksgiving' columns in our main data frame. Thus, we drop these columns to keep our data tidy.
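The holiday steps above can be sketched with pandas alone. Prophet expects a holiday table with the columns 'holiday', 'ds', 'lower_window', and 'upper_window'. The toy frame, the helper function, and the window sizes below are assumptions for illustration, not the values used in the project.

```python
import pandas as pd

# Toy frame with 0/1 holiday flags (real holiday dates, made-up layout).
df = pd.DataFrame({
    "ds": pd.to_datetime(
        ["2019-04-21", "2019-11-28", "2020-04-12", "2020-11-26", "2020-12-01"]
    ),
    "Easter": [1, 0, 1, 0, 0],
    "Thanksgiving": [0, 1, 0, 1, 0],
})

def holiday_frame(data, flag_col, name, lower_window, upper_window):
    """Build a Prophet-style holiday table from a 0/1 indicator column."""
    dates = data.loc[data[flag_col] == 1, "ds"]
    return pd.DataFrame({
        "holiday": name,
        "ds": dates.to_numpy(),
        "lower_window": lower_window,  # days before the date that are affected
        "upper_window": upper_window,  # days after the date that are affected
    })

# Window sizes are assumptions chosen for the example.
easter = holiday_frame(df, "Easter", "easter", -5, 2)
thanksgiving = holiday_frame(df, "Thanksgiving", "thanksgiving", -3, 6)
holidays = pd.concat([easter, thanksgiving], ignore_index=True)

# The indicator columns are no longer needed in the main frame.
df = df.drop(columns=["Easter", "Thanksgiving"])
print(holidays)
```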
Facebook Prophet Model
In this section, we utilize the Facebook Prophet library to construct a forecasting model. Prophet is specifically designed for forecasting time series data, offering intuitive parameters that can be easily tuned.
-
Initializing the Prophet Model:
​
​
​
We provide the holidays data frame we constructed earlier to account for Easter and Thanksgiving.
seasonality_mode indicates how seasonality interacts with the trend. 'multiplicative' means that seasonal effects scale with the level of the trend (for example, a holiday lifts demand by a percentage rather than by a fixed amount).
​
seasonality_prior_scale, holidays_prior_scale, and changepoint_prior_scale are tuning parameters to adjust the flexibility of the model regarding seasonality, holidays, and trend changes, respectively.
​
-
Adding Additional Regressors:
Here, we're enhancing the model by incorporating extra regressors or predictors. In this case, we consider the impact of 'Christmas', 'Temperature', and 'Marketing' on the target variable.
​
​
​
-
​ Fitting the Model:
This line trains the Prophet model using the provided data in the data frame.
​
​
​
​
-
Data Split for Cross-Validation:
We calculate how many observations we have after excluding the last 180 days. This might be useful to know when setting up the cross-validation.
​
​
-
Cross-Validation:
Here, we're performing cross-validation to assess the model's predictive performance.
​
​
​
horizon: Forecast horizon is set to 31 days, meaning we're making predictions 31 days into the future.
period: Every 16 days, a new forecast is made.
initial: The initial training period is set to 2012 days.
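To see how these three settings interact, the sketch below derives rolling-origin cutoff dates by working backwards from the end of the data. It is a simplified illustration of the idea, not Prophet's own implementation, and the date range is hypothetical; only the 31, 16, and 2012 day settings come from the text.

```python
import pandas as pd

def rolling_cutoffs(dates, initial, period, horizon):
    """Cutoff dates for rolling-origin evaluation, working back from the end."""
    start, end = dates.min(), dates.max()
    cutoff = end - horizon
    cutoffs = []
    while cutoff >= start + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

# Hypothetical date range for the example.
dates = pd.date_range("2015-01-01", "2020-12-31", freq="D")
cutoffs = rolling_cutoffs(
    dates,
    initial=pd.Timedelta(days=2012),
    period=pd.Timedelta(days=16),
    horizon=pd.Timedelta(days=31),
)
print(len(cutoffs), cutoffs[0].date(), cutoffs[-1].date())
```

Each cutoff marks the end of one training window; the model is then scored on the 31 days that follow it.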
​
-
Model Performance:
This provides various performance metrics for each forecast horizon. Metrics include RMSE, MAE, and others.
​
​
-
RMSE and MAPE:
We calculate and display the average Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) over all the cross-validation folds.
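For reference, both metrics are straightforward to compute by hand. The sketch below uses NumPy and made-up numbers; the function names are illustrative.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((actual - predicted)^2))."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)))

actual = [100.0, 200.0, 400.0]      # made-up values
predicted = [110.0, 180.0, 400.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in the units of the target and penalizes large misses more heavily, while MAPE is scale-free, which makes it easier to compare across series.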
​
​
-
​ Error Analysis:
Using this plot, we visualize the RMSE across different forecast horizons. It helps in understanding if the forecast errors increase over time, giving insights into the reliability of the forecast as time progresses.
​
​
Inference: The spread of gray dots at each horizon point suggests variability in the RMSE values of predictions. This could imply that some predictions are quite accurate while others deviate more substantially from the actual values.
​
The blue line fluctuates but doesn't show a consistent upward or downward trend, suggesting that the average prediction error does not consistently worsen or improve as the horizon extends. The model's performance is relatively stable over time within the 31-day period visualized.
​
Facebook Prophet Parameter Tuning
Parameter tuning is a pivotal step in building machine learning models to identify the best combination of parameters that lead to the most accurate predictions.
-
Setting up Parameter Grid:
Here, we define a range of values for each parameter we'd like to tune. We then create a grid of all possible combinations of these parameter values.
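A grid of all combinations can be built with the standard library alone. The candidate values below are illustrative assumptions, not the grid used in the project.

```python
from itertools import product

# Candidate values are illustrative assumptions.
param_grid = {
    "seasonality_mode": ["additive", "multiplicative"],
    "seasonality_prior_scale": [5, 10, 20],
    "holidays_prior_scale": [5, 10, 20],
    "changepoint_prior_scale": [0.01, 0.05, 0.1],
}

# Cartesian product of all value lists, one dict per combination.
names = list(param_grid)
grid = [dict(zip(names, values)) for values in product(*param_grid.values())]
print(len(grid))  # 2 * 3 * 3 * 3 combinations
print(grid[0])
```

Because the grid grows multiplicatively with every added value, and each combination is cross-validated, it pays to keep the candidate lists short.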
​
​
​​
​
-
Looping through Parameter Combinations:
We initialize an empty list to store the Root Mean Square Error (RMSE) for each parameter combination, to determine the best performing model.
​
We loop through each combination of parameters in the grid. Inside the loop, we initialize the Prophet model with the current parameter combination and train it with the data. This involves setting up the seasonality mode, scales for seasonality, holidays, and changepoints.
​
​
​
-
Cross-Validation and Error Calculation:
After fitting, we perform cross-validation to assess the model's forecasting accuracy. We then calculate the average RMSE over the validation period and append it to our rmse list.
​
-
​Storing Tuning Results:
Once all combinations are evaluated, we create a Data Frame to store all parameter combinations alongside their corresponding RMSE.
​
Creating a Forecasting Template with Python
In this section, we're setting up the foundation to create a forecast using data loaded from CSV files.
​
-
Loading the Data:
We start by loading two datasets. The first (df) contains historical data from the file 'nyc_data.csv'. The second (future_df) contains the data for which we want to make future forecasts, from the file 'future.csv'.
​
​
​
-
​ Merging Datasets:
We combine the historical data (df) and the future data (future_df) into one continuous dataset. Then, we reset the index to ensure it is continuous without any gaps or jumps.
​
Setting Up for Forecasting with Consideration of Holidays
In this section, we're getting our data and the best parameters ready to generate forecasts that take into account the effect of holidays.
​
-
Data Splitting:
The dataset is split into two parts.
'training' : This is the majority of our dataset and will be used to train our forecasting model.
'future_df': This subset contains the last 31 rows of our dataset. It represents the period for which we want to generate forecasts.
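The merge and split can be sketched as follows. The two small frames stand in for the historical and future files and contain made-up values; the column names 'ds' and 'y' follow the Prophet convention described earlier.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for the historical and future files.
df = pd.DataFrame({
    "ds": pd.date_range("2021-01-01", periods=100, freq="D"),
    "y": np.arange(100.0),
})
future_df = pd.DataFrame({
    "ds": pd.date_range("2021-04-11", periods=31, freq="D"),
    "y": np.nan,  # target is unknown for the forecast period
})

# Stack the two frames and rebuild a continuous 0..n-1 index.
df = pd.concat([df, future_df]).reset_index(drop=True)

# The last 31 rows are the forecast period; everything before is training data.
training = df.iloc[:-31]
future_df = df.iloc[-31:]
print(len(training), len(future_df))
```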
​
​
​
-
Loading Best Parameters:
We load the best model parameters from the previous tuning exercise, which are stored in a CSV file named 'best_params_prophet.csv'. These are the parameters that gave the best forecasting performance for our data.
​
​
-
​Extracting Parameter Values:
​
​
​
We extract individual parameter values from the loaded data:
'changepoint_prior_scale': This parameter controls how flexible the model is in adapting to changes in the trend.
'holidays_prior_scale': Represents the flexibility of the model in adapting to holiday effects.
'seasonality_prior_scale': Controls the flexibility in seasonal effects.
'seasonality_mode': Specifies if the seasonality is additive or multiplicative.
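A minimal sketch of reading the tuned values back, assuming the CSV holds a single row with one column per parameter. The file contents shown here are hypothetical.

```python
import io

import pandas as pd

# Hypothetical contents of 'best_params_prophet.csv' (values are made up).
csv_text = """changepoint_prior_scale,holidays_prior_scale,seasonality_prior_scale,seasonality_mode
0.05,10,5,multiplicative
"""
parameters = pd.read_csv(io.StringIO(csv_text))

# Pull each value out of the single row; cast numerics to plain floats.
changepoint_prior_scale = float(parameters.loc[0, "changepoint_prior_scale"])
holidays_prior_scale = float(parameters.loc[0, "holidays_prior_scale"])
seasonality_prior_scale = float(parameters.loc[0, "seasonality_prior_scale"])
seasonality_mode = parameters.loc[0, "seasonality_mode"]
print(changepoint_prior_scale, holidays_prior_scale, seasonality_prior_scale, seasonality_mode)
```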
​
Building the Forecasting Model with Facebook Prophet
After setting up our data and defining the potential influences of holidays on our target variable, we seamlessly transition into the core of our analysis: constructing a robust forecasting model using Facebook's Prophet library.
-
Initializing the Model:
Here, we initialize the Prophet model with various parameters. Prophet allows us to add additional regressors (variables) that might have an impact on our forecasts.
​
​
​
​
​
​
​
​
-
​Training the Model:
With the model initialized and regressors added, the next step is to train it on our training data.
​
​
Generating Forecasts with Prophet
-
Creating a Future Data Frame:​
Our first task is to craft a 'future' data frame. This data frame will outline the specific dates or time intervals for which we desire predictions.
The 'make_future_dataframe' method helps us generate this data frame, and we've chosen to forecast for the same duration as our 'future_df'. In this context, we're making daily ('freq = "D"') forecasts.
​​
-
Generating Predictions:
With our future data frame in hand, we invoke Prophet's predict method to obtain forecasts. This produces a new data frame, named forecast, populated with various prediction-related metrics.
​
​
​
-
​Visualizing Component Influences:
Prophet also enables us to understand the factors driving our predictions. Using 'plot_components', we generate a visualization that deciphers the impact of trends, weekly seasonality, yearly seasonality, and the holidays on our target.
​
​
​
​
SARIMAX
In this part of our analysis, we set up advanced time series forecasting with SARIMAX, which stands for Seasonal AutoRegressive Integrated Moving Average with eXogenous regressors. SARIMAX is a powerful model that can account for complex patterns in time series data such as trends, seasonality, and the influence of external factors.
-
Import Libraries
We start by importing necessary Python libraries that will be used throughout our analysis.
​
​
​
​
​
​
​​
​
-
Load Data
We then load our dataset into a pandas Data Frame. The data is expected to have dates in 'YYYY-MM-DD' format, which we specify to be parsed as datetime objects.
​
​
​​
​​
​
-
Renaming the Target Variable
We then rename the column representing the target variable, which in our case is 'Demand', to 'y'. This is a common convention when working with time series forecasting, as many functions and methods expect the target variable to be named 'y'.
​
​
​
-
Extract Regressors
If there are additional variables that could help explain fluctuations in the target variable (i.e., exogenous variables), we extract them into a separate DataFrame called X. This will later be used as input to the SARIMAX model along with the target variable 'y'.
​
​
Data Stationarity
Transitioning from the foundational steps of loading and preparing our data, we now pivot to a critical aspect that underpins the reliability of our time series forecasting: stationarity.
It’s essential to ensure that our time series data is not influenced by trends or seasonality that could mislead our analysis. This step is not merely a precaution; it's a cornerstone for models that assume a consistent mean and variance over time, like SARIMAX.
​
-
Augmented Dickey-Fuller Test
We begin by deploying the Augmented Dickey-Fuller test, a statistical procedure that speaks volumes about the time series' stationarity — a prerequisite for the SARIMAX model we aim to employ.
​
-
Differencing: The Path to a Stationary Series
To rectify non-stationarity, we resort to differencing, a technique where we subtract the previous observation from the current one. It's a simple yet effective remedy for removing trends, and differencing at the seasonal lag does the same for seasonal cycles.
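As a quick illustration, first-order differencing replaces each value with y[t] - y[t-1]. The sketch below applies it with pandas to a made-up trending series.

```python
import pandas as pd

y = pd.Series([10.0, 12.0, 15.0, 19.0, 24.0])  # made-up trending series

diff1 = y.diff().dropna()          # y[t] - y[t-1]
diff2 = y.diff().diff().dropna()   # differencing applied twice
print(diff1.tolist())
print(diff2.tolist())
```

The first difference still trends upward here, while the second difference is constant, which shows why a series sometimes needs to be differenced more than once before it looks stationary.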
​
​
​
​
​
​
​
Once stationarity is confirmed, we can proceed, knowing our model has a solid foundation to provide accurate forecasts.
​
Refining Predictions with the SARIMAX Model
Having ensured our data's stationarity, we advance to the core of our forecasting journey—the SARIMAX model.
​
-
Crafting the SARIMAX Model
We define our SARIMAX model parameters with care. These parameters are not arbitrary; they are informed by our data's unique characteristics and prior diagnostics. For daily data with potential weekly patterns, we incorporate a seasonal component that reflects this.
​
​
​
​
Here, (1, 1, 1) represents the non-seasonal ARIMA component (autoregressive, differencing, and moving average), and (1, 1, 1, 7) encapsulates the seasonal aspect, capturing the essence of weekly seasonality with a periodicity of 7 days.
​
-
Validation with Cross-Validation
To validate our model, we implement a rolling forecast cross-validation. This technique simulates a series of train-and-test scenarios that mimic how we'd predict future data points.
​
​
Our choice of a 31-day horizon (h=31) is deliberate, aligning with the common business need for a month-ahead forecast.
​
-
Performance Evaluation
We scrutinize the model's performance through the lens of mean squared error, a metric that accentuates any large deviations in our predictions.
​
​
​
​
​
Calculating the square root of the mean squared error gives us the root mean square error (RMSE), an intuitive measure of our model's accuracy.
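The rolling evaluation described above can be sketched without any forecasting library. In the example below a seasonal-naive forecaster (repeat the last observed week) stands in for SARIMAX so that the code stays self-contained; the data, the function names, and the fold settings other than the 31-day horizon are assumptions.

```python
import numpy as np

def rolling_rmse(y, h, step, min_train, forecast_fn):
    """Forecast h steps from successive origins; return one RMSE per fold."""
    scores = []
    for origin in range(min_train, len(y) - h + 1, step):
        train, test = y[:origin], y[origin:origin + h]
        pred = forecast_fn(train, h)
        mse = np.mean((test - pred) ** 2)
        scores.append(float(np.sqrt(mse)))  # RMSE is the square root of MSE
    return scores

def seasonal_naive(train, h, season=7):
    """Stand-in forecaster: repeat the last observed week."""
    last = train[-season:]
    return np.array([last[i % season] for i in range(h)])

rng = np.random.default_rng(1)
t = np.arange(400)
y = 200 + 20 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 5, t.size)  # made-up data

scores = rolling_rmse(y, h=31, step=31, min_train=200, forecast_fn=seasonal_naive)
print(len(scores), round(float(np.mean(scores)), 2))
```

Swapping `seasonal_naive` for a function that fits a real model on `train` and returns `h` predictions gives the same evaluation loop for any forecaster.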
​
​
Fine-Tuning SARIMAX Parameters
After establishing the foundational structure of our SARIMAX model, we move onto the critical stage of parameter tuning. This process is akin to fine-tuning an instrument, ensuring each note resonates with clarity and precision. In the realm of time series forecasting, such precision is pivotal for achieving the most accurate predictions.
-
Exploring the Parameter Space
To commence, we define a grid of parameters that the SARIMAX model will explore. The grid encompasses variations of the order of autoregressive (AR), differencing (I), and moving average (MA) components, alongside their seasonal counterparts.
​
​
​
​
​
​
​​
​
-
Iterative Model Tuning
For each set of parameters, a new SARIMAX model is specified and assessed using cross-validation. This methodical approach is not only thorough but necessary to unearth the optimal configuration.
​
​
​
​
​
​
​
​
​
Each iteration captures the performance of the model variant via RMSE, a decisive measure that will guide our selection process.
​
-
Synthesizing Tuning Results
Finally, we collate the performance metrics of each parameter combination into a comprehensive data frame. This step transforms raw data into insightful, actionable information.
​
​
​
Implementing SARIMAX Forecasting
In this pivotal section, we transition from preparation to execution, employing the Seasonal Autoregressive Integrated Moving Average with eXogenous factors (SARIMAX) model to project future trends based on historical data and identified best parameters.
​
-
Optimizing Parameters for Robust Forecasting
Our first task is to fetch the pre-determined optimal parameters for the SARIMAX model.
​
​
​​
​
-
Constructing the SARIMAX Model
With the parameters in hand, we construct our model.
​
​
​
​
We set the model's order and seasonal order using the optimized parameters, alongside our regressors (train_X). The model is tailored to our specific time series characteristics, capturing weekly patterns in the daily data (as indicated by the seasonal order's last parameter, 7, for a weekly cycle).
​
-
Training the Model
Next, we commence training. This step is akin to calibrating our instruments, ensuring the model has learned from the past and is ready to predict the future.
​
​
​​
​
-
Projecting into the Future
We now venture into the core of forecasting. The model proceeds to predict future values, utilizing the date ranges provided by 'future_df' and incorporating the external regressor data from 'future_X' to enhance the accuracy of its forecasts.
​
​
​
​​
​
-
Visualizing the Forecast
To bring the results to life, we craft a visual narrative.
​
​
​
​
The graph presents a clear visual story of our business's past and projected future demand. The blue line shows actual demand up to the present, reflecting the real numbers we've recorded. It's a history of the highs and lows, each peak and dip telling a story of customer choices and market conditions.
​
The orange line is where we look ahead. Based on past patterns, this line is our forecast for the next period, suggesting what demand might look like if current trends continue. It's smoother than the historical data because it represents an average of expected outcomes, not the day-to-day variations.
LinkedIn Silverkite
LinkedIn Silverkite Forecasting offers a comprehensive approach to data forecasting, with various data inputs and functions that allow for both automated and customized modeling. It incorporates machine learning techniques and provides visualizations for a holistic understanding of data trends.
-
Importing Necessary Libraries
To commence with the forecasting process, we import a combination of Python data manipulation, visualization, and Silverkite-specific libraries.
​
​
​
​
​
​
​
​​
-
Loading the Dataset
Our next step revolves around loading the datasets we'll utilize for our forecast.
​
​
​
-
Preparing the Data
Subsequent to loading, we integrate the two datasets into one consolidated data frame and perform some basic data manipulations.
​
​
​
Finally, for the sake of clarity and consistency, we rename columns to fit the naming conventions required by our forecasting tools.
​
​
Preparing for Silverkite Forecasting
-
Specifying Metadata
First and foremost, we define the fundamental metadata for our time series data. This includes the column names and the frequency of our data, which in our case is daily.
​
​
​
​
-
Setting Growth Terms
Here, we're specifying potential growth terms that our model might consider.
​
​
​​
​
-
Configuring Seasonalities
Silverkite can automatically detect and adjust for various types of seasonalities, from yearly to daily. We've set all of them to "auto" for automatic detection.
​
​
​
​
-
Incorporating Holidays and Special Events
Silverkite provides the flexibility to factor in holidays from different countries and specify custom events.
​
​
​
​
​
​
​
​
​
-
Designating Changepoints
Changepoints allow the model to adjust to sudden changes in trends. We've set the method to "auto" for automatic changepoint detection.
​
​
​
-
Using External Regressors
Regressors are external factors or variables that can influence the target variable. Here, we specify some potential regressors.
​
​
​
-
Incorporating Lagged Regressors
Lagged regressors help the model consider past values of external factors.
​
​
​
​
​
-
Applying Auto Regression
Auto regression allows the model to predict a value based on its past values. We've set it to "auto".
​
​
​
-
Selecting Fitting Algorithms
Lastly, Silverkite supports various algorithms to fit the model. We specify a list of potential algorithms.
​
​
​
​
​
Building and Evaluating the Silverkite Forecasting Model
-
Model Construction
Having specified the individual components in the preparation phase, we now assemble them to create a cohesive Silverkite forecasting model.
​
​
​
​
​
​
​
​
-
Setting Up Cross-Validation
To gauge the performance of our forecasting model, we employ a cross-validation approach. Here's how we've structured it.
​
​
​
​
-
Defining Evaluation Metrics
The metrics will give us a quantitative measure of how well our model performs. We're using the Root Mean Squared Error (RMSE) for this.
​
​
​
​
​
-
Configuration for Forecasting
By consolidating everything from metadata, components, cross-validation settings, and evaluation metrics, we build the complete configuration for forecasting.
​
​
​
​
​
​​
-
Executing the Forecast
Using the configuration, we initialize a Forecaster and run the forecast.
​
​
​
​​
​​
Analyzing Parameter Tuning Results for Silverkite Model
-
Summarizing Cross-Validation Results
To refine our Silverkite forecasting model, we have employed a grid search that exhaustively tries out different combinations of parameters to find the best fit. Once the cross-validation process is complete, we can summarize the results.
​
​
​​
​
​
-
Organizing Results
For ease of interpretation, we'll reorganize the results into a more readable format by setting the parameter combinations as the index.
​
​
​
​
-
Identifying Top Performers
With the results clearly laid out, we turn our attention to the combinations that yield the best performance, specifically those with the lowest RMSE as it is our selected performance metric.
​
​
​
​
​
Retrieving Optimal Parameters for Silverkite Forecast
In the quest to generate the most accurate forecasts possible using the Silverkite model, we've completed a thorough parameter tuning process. Our next step involves integrating the best parameters identified into our forecasting model.
-
Loading the Best Parameters
Firstly, we'll load the CSV file that contains the parameters that were determined to be the most effective during the cross-validation phase.
​
​​
​
​
-
Extracting Parameter Values
After loading the parameters, we extract the specific values that we'll use to configure our Silverkite model for the final forecast.
​
​
​
​
Tailored Configurations
The structure and objectives remain largely the same as previous steps, ensuring a coherent approach throughout the forecasting setup.
The specific changes implemented here are the integration of a YAML-parsed fitting algorithm and the explicit definition of the growth term using the optimal parameter previously determined.
-
Growth Configuration:
Unlike the earlier generalized approach, here we specifically set the growth term based on previously identified best parameters to guide the model’s trend component.​
​
-
Custom Fitting Algorithm:
The other change in this step is the introduction of YAML parsing for the fitting algorithm parameter. This allows the model to interpret the optimal machine learning algorithm for fitting the data, as determined from prior tuning results.
​
​
​
Silverkite Model Forecast
-
Model Assembly
We synthesize the model by combining all the individual components: growth, seasonality, events, changepoints, regressors, lagged regressors, autoregression, and the custom fitting algorithm.
​
​
​
​
​
​
​
-
Cross-Validation Setup
Here, we fine-tune the cross-validation (CV) strategy by specifying the minimum number of training periods, enabling an expanding window approach for CV, and determining the number and spacing of the CV splits. The number of max splits has been updated to 50, allowing for more granularity and robustness in the validation process.​
​
​
​
​
​
​​
​
-
Evaluation Metrics
The Root Mean Squared Error (RMSE) is chosen as the selection metric for CV, focusing on minimizing forecast errors.
​
​
​
​
​​
​
-
Forecast Configuration:
This encapsulates all the set parameters and components, establishing the blueprint for the Silverkite forecast model.
​
​
​
​
​
​
​​
​​
-
Model Execution and Summary:
The Forecaster object is instantiated, and the configured forecast is executed on the dataset. We print out the summary of the final estimator in the pipeline for an in-depth review of the model.
​
​
​
​
​
-
Visualization of Components
The model's various components are visualized to provide insight into the contribution of trends, seasonality, and events to the forecast.
​
​
​
​
Long Short-Term Memory
Long Short-Term Memory (LSTM) is a specialized type of recurrent neural network (RNN) architecture that has gained significant popularity in the realm of time series forecasting and sequence prediction. Unlike traditional statistical models such as ARIMA, SARIMA, and machine learning models like Facebook Prophet and LinkedIn Silverkite, LSTM is a deep learning model specifically designed to capture patterns and dependencies in sequential data.
​
Preparing Time Series Data and Covariates for LSTM
In this section, we prepare the data for training a Long Short-Term Memory (LSTM) model. Here's a step-by-step guide to what's being done:
​
-
Time Series Object Preparation:
Start by creating a time series object called series from the dataset. This series contains the primary time-dependent data for modeling. Additionally, there's a separate time series object called covariates that includes other relevant data from the Data Frame, excluding the timestamp or date information.
​
​
​​​​
-
Time Series Attributes:
To capture various aspects of time-related patterns, additional time series attributes are created for the month and the weekday.
​
​
​
​
​
​
​
​
​
-
Data Transformation:
Prepare scalers used for normalizing the data. Normalization is a critical step in machine learning to ensure that all data falls within the same numerical range.
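As a sketch of what such a scaler does, the class below implements min-max scaling to the range [0, 1] with NumPy, together with the inverse transformation. Min-max scaling is an assumption here (the text does not name the scaling method), and the class name and data are illustrative.

```python
import numpy as np

class MinMaxScaler1D:
    """Scale values to [0, 1] using the min and max seen during fit."""

    def fit(self, x):
        x = np.asarray(x, dtype=float)
        self.min_, self.max_ = x.min(), x.max()
        return self

    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.min_) / (self.max_ - self.min_)

    def inverse_transform(self, x):
        return np.asarray(x, dtype=float) * (self.max_ - self.min_) + self.min_

y = np.array([500.0, 650.0, 800.0, 575.0])  # made-up demand values
scaler = MinMaxScaler1D().fit(y)
y_scaled = scaler.transform(y)
y_back = scaler.inverse_transform(y_scaled)
print(y_scaled)
```

Keeping the fitted scaler around matters: the same object is needed later to map the model's scaled predictions back to the original units.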
​
​
​
​​
-
Scaling Time Series (Y):
Stack the covariates data to match the time steps of the main time series. This ensures that the covariate data aligns with the time series data.​
​
​
​
​
-
Scaling Covariates:
Apply a scaler (transformer2) to normalize the covariates. This step ensures that all data used for modeling falls within a consistent numerical range.
​
-
Stacking Covariates with Attributes:
Stack the covariates along with the generated month_series and weekday_series. This allows incorporation of one-hot encoded month and weekday attributes into the covariate data.
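The one-hot encoding of month and weekday can be illustrated with pandas. This is a sketch of the idea only, not the time series library's own helper; the date range and variable names are assumptions.

```python
import pandas as pd

dates = pd.date_range("2021-01-01", "2021-12-31", freq="D")

# One indicator column per month (12) and per weekday (7).
month = pd.get_dummies(pd.Series(dates.month, index=dates), prefix="month")
weekday = pd.get_dummies(pd.Series(dates.weekday, index=dates), prefix="weekday")

# Stack the attribute columns side by side, aligned on the date index.
attrs = pd.concat([month, weekday], axis=1)
print(attrs.shape)
```

Each row has exactly two active indicators, one for its month and one for its weekday, which lets the network learn a separate effect for each calendar position.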
​
​
​
​
​
LSTM Model Building
In this section, we delve into constructing the Long Short-Term Memory (LSTM) model, a powerful tool for time series forecasting. Let's break down what's happening:
-
Model Initialization:
A crucial step is defining the LSTM model. We configure it with the following parameters:
​​
​
​
​
​
​
​
​
​
​
-
​ Model Training:
We proceed to train the LSTM model using the prepared data. Here's the code for fitting the model:
​
​
​
Cross Validation for LSTM Model
In this section, we evaluate the LSTM model with cross-validation. Let's break down what's happening:
​
-
Setting Up Cross-Validation:
In this section, we set up the cross-validation process to evaluate our LSTM model's forecasting capabilities. The objective is to assess how well the model performs on unseen data.
​
​
​
​
​
​
-
Calculating RMSE (Root Mean Squared Error):
To evaluate the performance of our LSTM model, we calculate the Root Mean Squared Error (RMSE) between the predicted and actual values.
​
​
​
​
​
​
​
​
​
​
​
​
​
Parameter Tuning for LSTM Model
Tuning model parameters is a crucial step to optimize the performance of our LSTM model. Here's a step-by-step breakdown of what's happening:
​
-
Defining Parameter Grid:
We start by specifying a grid of hyperparameters to explore. These hyperparameters influence the behavior and performance of our LSTM model.
​
​
​
​
​
​
​
The grid covers various aspects of the model, such as the number of recurrent layers, hidden layer dimensions, dropout rates, training epochs, and more.
​
-
Parameter Tuning Loop:
In this section, we go through a loop that assesses the model's performance for different parameter settings.
-
Analyzing the Results:
After completing the parameter tuning loop, we can analyze the results to identify the best hyperparameters.
​
​
​
​
​
-
Get Best Parameters:
We extract the values of the best parameters. These values can be used to configure the final LSTM model for optimal forecasting performance.
​
​
​
​
Parameter Tuning Continues for LSTM Model
Continuing from the previous section, we refine the hyperparameter tuning process. Here's an explanation of what's happening:
​
-
Defining an Extended Parameter Grid:
We further explore the hyperparameter space by specifying a new parameter grid that includes various options for hyperparameters like learning rate, training length, and input chunk length.
​
​
​
​
​
​
-
​ Creating an Extended Parameter Grid:
We generate a new list of parameter combinations based on the extended grid.
​
​
​
This code gives us an overview of the total number of parameter combinations to be evaluated.
​
-
Extended Parameter Tuning Loop:
As before, we loop through the parameter grid and evaluate the model's performance for each combination, this time over the extended set of hyperparameters.
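One way to organize a second tuning round, sketched here under the assumption that the best values from the first round are held fixed while only the new hyperparameters vary. The parameter names and values below are hypothetical placeholders.

```python
from itertools import product

# Assumed outcome of the first tuning round (placeholder values).
best_first_round = {"n_rnn_layers": 2, "hidden_dim": 20, "dropout": 0.1}

# Hypothetical extended grid covering the newly explored hyperparameters.
extended_grid = {
    "lr": [0.001, 0.003, 0.01],
    "training_length": [20, 30],
    "input_chunk_length": [15, 20],
}

keys = list(extended_grid)
extended_combinations = [
    # Merge the fixed first-round values with each new combination.
    {**best_first_round, **dict(zip(keys, values))}
    for values in product(*(extended_grid[key] for key in keys))
]
print(len(extended_combinations), "extended combinations")
```

Tuning in stages keeps the search tractable: a full grid over all six parameters would be far larger than two smaller grids evaluated one after the other.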
​
​
​
​
​
​
​
​
​
​
​
​
​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​​
​
​
-
Analyzing the Results:
After completing the extended parameter tuning loop, we analyze the results to identify the best hyperparameters. This code compiles the RMSE values for all extended parameter combinations.
​
​
​
​
LSTM Forecasting Model
In this section, we utilize the best hyperparameters obtained from our parameter tuning process to create forecasts using the LSTM model. Here's a step-by-step explanation of the code:
​
-
Retrieving Best Parameters:
We start by retrieving the best-performing hyperparameters from the CSV file where we stored them. These parameters were determined to result in the most accurate LSTM model for our specific dataset.
​
-
Extracting Parameters:
We then extract each of these hyperparameters into its own variable so that it can be passed to the LSTM model when we configure it.
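The retrieval and extraction steps can be sketched with the standard `csv` module. The file contents, column names, and values below are assumptions standing in for the stored results; an in-memory string is used so the example runs without a file on disk.

```python
import csv
import io

# Stand-in for the stored CSV file: columns and values are hypothetical.
csv_text = "n_rnn_layers,hidden_dim,dropout,lr\n2,20,0.1,0.003\n"

# Read the first (and only) data row as a dict of strings.
row = next(csv.DictReader(io.StringIO(csv_text)))

# CSV values arrive as text, so cast each one to the type the model expects.
casts = {"n_rnn_layers": int, "hidden_dim": int, "dropout": float, "lr": float}
best_params = {name: cast(row[name]) for name, cast in casts.items()}
print(best_params)
```

Casting explicitly matters here: layer counts and hidden dimensions must be integers, and a value read back as text or as a float can cause errors when the model is built.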
​
​
​
​
​
​
-
LSTM Model Configuration:
We define the LSTM model with the best parameters.
​
​
​
​
​
​
​
​
​
-
Model Fitting:
We fit the LSTM model with the training data and covariates to prepare it for making predictions.
​
​
​
This step ensures that the model is trained on historical data and is ready to forecast future values.
​
-
Generating Predictions:
Finally, we generate predictions for the future using the LSTM model. Because the model works on transformed (scaled) data, its raw predictions are on that transformed scale; inverse_transform reverses the transformation and brings the predictions back to their original scale.
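To make the role of the inverse transformation concrete, here is a small sketch that assumes min-max scaling. The class `SimpleMinMaxScaler` is a hypothetical stand-in written for this example, not the scaler used in the project, and the numbers are invented.

```python
class SimpleMinMaxScaler:
    """Scale values to [0, 1] based on the range seen in `fit`."""

    def fit(self, values):
        # Assumes the fitted values are not all identical (non-zero range).
        self.low = min(values)
        self.high = max(values)
        return self

    def transform(self, values):
        span = self.high - self.low
        return [(v - self.low) / span for v in values]

    def inverse_transform(self, scaled):
        span = self.high - self.low
        return [s * span + self.low for s in scaled]


history = [200.0, 250.0, 300.0]          # made-up demand history
scaler = SimpleMinMaxScaler().fit(history)

scaled_forecast = [0.5, 1.0, 1.2]        # model output on the scaled axis
forecast = scaler.inverse_transform(scaled_forecast)
print(forecast)
```

A scaled prediction above 1.0 simply maps to a value above the historical maximum, so the inverse transformation does not clip the forecast.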
​
Ensemble
Ensemble learning is a widely used machine learning technique built on the idea that combining multiple models often yields more accurate and robust predictions than any individual model. The approach aims to reduce the risk of overfitting and improve generalization.
​
Combining Predictions and Weights
In this section, we focus on creating an ensemble of predictions from various time series forecasting models, such as Prophet, SARIMA-X, Silverkite, and LSTM. The goal is to leverage the strengths of each model to improve the overall forecasting accuracy. Here's what we are doing:
​
-
Importing Predictions:
We begin by importing the predictions made by each of the four models: Prophet, SARIMA-X, Silverkite, and LSTM. These predictions are stored in separate CSV files.
​
​​
​​
-
Selecting Relevant Data:
After importing the predictions, we extract the essential columns we need for the ensemble. These columns include the timestamp ("ds") and the model-specific predictions.
​
​
​
-
Calculating Errors:
To assign weights to each model, we calculate the mean squared error (MSE) for their respective forecasts. The MSE is a measure of prediction accuracy, with lower values indicating better performance.
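The error calculation can be sketched as follows. The actual values and forecasts are invented for illustration, and the dictionary keys are simply labels for the four models.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers: three days of actual demand and four competing forecasts.
actual = [100.0, 110.0, 120.0]
forecasts = {
    "prophet": [102.0, 108.0, 121.0],
    "sarimax": [95.0, 115.0, 125.0],
    "silverkite": [101.0, 110.0, 119.0],
    "lstm": [104.0, 106.0, 124.0],
}

# One MSE per model, keyed by model name.
errors = {name: mse(actual, preds) for name, preds in forecasts.items()}
for name, err in errors.items():
    print(f"{name}: {err:.3f}")
```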
​
​
​
​
​
-
Determining Weights:
Next, we calculate a weight for each model from its MSE value. These weights determine how much influence each model has in the ensemble. Each weight is inversely proportional to the model's error, so models with lower error receive higher weights, and the weights are normalized so that they sum to 1.
The goal is to give more importance to models with better predictive performance.
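A minimal sketch of inverse-error weighting, assuming each weight is proportional to 1/MSE and then normalized. The MSE values are placeholders, and the exact formula used in the project may differ.

```python
# Placeholder MSE values per model (not the project's actual errors).
mse_by_model = {"prophet": 4.0, "sarimax": 8.0, "silverkite": 2.0, "lstm": 8.0}

# Lower error -> larger inverse -> larger weight.
inverse_errors = {name: 1.0 / err for name, err in mse_by_model.items()}

# Normalize so that the weights sum to 1.
total = sum(inverse_errors.values())
weights = {name: inv / total for name, inv in inverse_errors.items()}

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
print("sum of weights:", sum(weights.values()))
```

With these placeholder errors, the model with half the error of another receives twice its weight, which is the behaviour inverse proportionality implies.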
​
​
​
​
​
​
​
​​
-
Verifying Weights:
We calculate an "extra weight" to ensure that the sum of all weights equals 1. This step is essential to guarantee that the ensemble's predictions are correctly weighted.
​​
​
​
Ensemble Forecast
In this section, we create the ensemble forecast by blending the predictions from the individual forecasting models.
-
Ensemble Formula:
The weights assigned to each model, which were determined in the previous section, are used to adjust the contribution of each model. The ensemble formula is designed to give more influence to models with better predictive performance.
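In its simplest form, such a formula is a weighted sum: for each date, the ensemble value is the sum over models of weight times forecast. The sketch below assumes that form and uses placeholder weights and forecasts.

```python
# Placeholder weights (summing to 1) and two-day forecasts per model.
weights = {"prophet": 0.25, "sarimax": 0.125, "silverkite": 0.5, "lstm": 0.125}
forecasts = {
    "prophet": [100.0, 120.0],
    "sarimax": [80.0, 160.0],
    "silverkite": [104.0, 116.0],
    "lstm": [96.0, 128.0],
}

n_steps = len(next(iter(forecasts.values())))

# Weighted sum across models, computed separately for each forecast step.
ensemble = [
    sum(weights[name] * forecasts[name][step] for name in forecasts)
    for step in range(n_steps)
]
print(ensemble)
```

Because the weights sum to 1, the ensemble value for each date always lies between the lowest and highest individual forecasts for that date.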
​
​
​
​
​
-
Ensemble Visualization:
After generating the ensemble forecast, we visualize the results by plotting the forecasted values alongside the actual data.
​
​
​
​
Output:
This graph shows five forecast lines for demand over January 2021: the four individual models plus their ensemble. All of them follow a similar pattern of peaks and valleys, which may indicate a regular cycle in the data, such as recurring day-to-day fluctuations. The ensemble line combines the insights from all models, which suggests that we can potentially increase the accuracy of our predictions by considering multiple models instead of relying on just one.