Auto ARIMA Learner

Trains AutoRegressive Integrated Moving Average (ARIMA) models and returns the best model according to the search criterion (AIC or BIC) within the provided constraints (maximum p, d, q). An ARIMA model captures temporal structure in time series data through the following components:
- AR: Relationship between the current observation and a number (p) of lagged observations
- I: Degree (d) of differencing required to make the time series stationary
- MA: Time series mean and the relationship between the current forecast error and a number (q) of lagged forecast errors
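
To make the I component concrete, here is a minimal sketch of differencing in plain Python. The helper name `difference` is hypothetical and is not part of the component; it only illustrates what applying differencing d times does to a series.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (hypothetical helper)."""
    values = list(series)
    for _ in range(d):
        # Each pass replaces the series with the change between neighbours.
        values = [b - a for a, b in zip(values, values[1:])]
    return values


trend = [1, 4, 9, 16, 25]          # series with a quadratic trend
once = difference(trend, d=1)      # [3, 5, 7, 9]  (still trending)
twice = difference(trend, d=2)     # [2, 2, 2]     (constant)
```

Each differencing pass shortens the series by one observation, which is one reason to keep the maximum d small.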

Additionally, coefficient statistics and residuals are provided as table outputs.
*Note that the (p,d,q) values of the selected model can be found in the model summary output table.

Model Summary metrics:
RMSE (Root Mean Square Error)
MAE (Mean Absolute Error)
MAPE (Mean Absolute Percentage Error)
*MAPE will be missing if the target column contains zeroes
R2 (Coefficient of Determination)
Log Likelihood
AIC (Akaike Information Criterion)
BIC (Bayesian Information Criterion)
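
As a rough illustration of the error metrics listed above, the following sketch computes RMSE, MAE, MAPE, and R2 from actual and fitted values using their standard textbook definitions. The function name `summary_metrics` is hypothetical, and reporting MAPE as a fraction rather than a percentage is an assumption; the component may scale or compute these differently.

```python
import math


def summary_metrics(actual, predicted):
    """Standard-definition error metrics (hypothetical helper)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE divides by the actual values, so it is undefined when any
    # actual value is zero; it is returned as None (missing) in that case.
    if any(a == 0 for a in actual):
        mape = None
    else:
        mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R2 = 1 - SS_res / SS_tot; assumes the actual values are not all equal.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}


metrics = summary_metrics([2, 4, 6], [1, 4, 7])
```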

Note: This component requires a Python environment with the StatsModels package installed. This blog post explains how to set up the KNIME Python extension:
https://www.knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20

Required extensions:
KNIME Python Integration
(https://hub.knime.com/knime/extensions/org.knime.features.python2/latest)
KNIME Quick Forms
(https://hub.knime.com/knime/extensions/org.knime.features.js.quickforms/latest)

Options

Target Column
The numeric column to fit the model.
Max AR Order
The maximum number of lagged observations to be used in the model.
Max I Order
The maximum number of times to apply differencing before training the model.
Max MA Order
The maximum number of lagged forecast errors to be used in the model.
Estimation Method
The log likelihood to maximize. The default method is css-mle.

- css-mle: maximize the conditional sum of squares likelihood and use these values as starting values to compute the exact likelihood via the Kalman filter.
- mle: maximize the exact likelihood via the Kalman filter.
- css: maximize the conditional sum of squares likelihood.
Search Criterion
The criterion used for model selection (AIC or BIC). The model with the lowest value of the selected criterion is returned. For example, if AIC is selected, the model with the lowest AIC value is selected.
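
The selection step can be sketched as a search over all (p,d,q) combinations up to the configured maxima, keeping the candidate with the lowest criterion value. This is an illustration under stated assumptions, not the component's actual code: the exhaustive grid is an assumption (the search strategy is not documented here), `fit` is a hypothetical callback that returns a fitted model's log likelihood and number of estimated parameters, and AIC and BIC use their standard definitions.

```python
import math
from itertools import product


def information_criterion(log_likelihood, n_params, n_obs, criterion="AIC"):
    """Standard AIC = 2k - 2lnL and BIC = k*ln(n) - 2lnL."""
    if criterion == "AIC":
        return 2 * n_params - 2 * log_likelihood
    if criterion == "BIC":
        return n_params * math.log(n_obs) - 2 * log_likelihood
    raise ValueError(f"unknown criterion: {criterion}")


def select_order(fit, n_obs, max_p, max_d, max_q, criterion="AIC"):
    """Return the (p, d, q) order with the lowest criterion value.

    `fit` is a hypothetical callback: fit((p, d, q)) -> (log_likelihood, n_params).
    """
    best_order, best_score = None, math.inf
    for order in product(range(max_p + 1), range(max_d + 1), range(max_q + 1)):
        log_likelihood, n_params = fit(order)
        score = information_criterion(log_likelihood, n_params, n_obs, criterion)
        if score < best_score:  # lower is better for both AIC and BIC
            best_order, best_score = order, score
    return best_order, best_score
```

In practice `fit` would wrap a real estimator, for example a StatsModels ARIMA fit, and return its log likelihood and parameter count. Because BIC penalizes each parameter by ln(n) rather than 2, it tends to prefer smaller orders than AIC on longer series.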

Input Ports

Table containing the numeric target column used to fit the ARIMA model.

Output Ports

ARIMA model.
Table containing the selected ARIMA (p,d,q) model, coefficient statistics, and the following evaluation metrics of the ARIMA model: RMSE, MAE, MAPE, R2, Log Likelihood, AIC, and BIC.
Table containing the residuals.
