Random forest model
4) Connect the Random Forest Learner (Regression) node to the output port of the Lag Column node. Make sure the target column is cluster_26 and the input columns are its lagged values, cluster_26(-n).
5) Create a separate branch from the output of the Lag Column node, with a Top k Row Filter node. Sort by row ID in descending order, so that the last time point is on the first row. Keep only the first row of the sorted table. Send the output to the Recursive Loop Start node.
6) The Recursive Loop Start node sends an updated table of the target and the predictors to the process component. The process component shifts the lag columns by 1 time point and adds the target column (cluster_26) as the most recent lag (cluster_26(-1)).
7) Add a Random Forest Predictor (Regression) node. Add the trained model from the Random Forest Learner (Regression) node to the top input port, and the output from the process component to the bottom input port. Change the prediction column name to cluster_26.
8) From the prediction table generated at the output of the Random Forest Predictor (Regression) node, remove the column cluster_26 (Prediction Variance) with a Column Filter node.
9) Send the output of the Column Filter node from step 8) to both input ports of the Recursive Loop End node. Set the maximum number of iterations to 168 (168 hours, or 1 week).
10) In the output of the Recursive Loop End node, rename the column cluster_26 to Forecasts with a Column Renamer node.
11) The model evaluation is similar to that of the SARIMA model exercise. Both the prediction table and the original data table have their row IDs regenerated (Row ID node) and are then combined (Joiner node). Then various evaluation metrics are calculated (Numeric Scorer node). The forecasts and the original data are plotted (Line Plot node).
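
For readers who want to see the recursive logic of steps 5) to 9) outside KNIME, the following is a minimal Python sketch, not the workflow itself. All function names are hypothetical, and placeholder_model is only a stand-in for the trained random forest (it averages the lags) so that the example runs without extra libraries. The mae and rmse helpers illustrate two common error metrics of the kind computed in step 11); they are written from their standard definitions.

```python
from math import sqrt
from typing import Callable, List, Sequence


def make_lag_rows(series: Sequence[float], n_lags: int) -> List[dict]:
    """Build one row per time point holding the target and its n_lags previous values."""
    rows = []
    for t in range(n_lags, len(series)):
        # lags[0] plays the role of cluster_26(-1), lags[1] of cluster_26(-2), ...
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append({"target": series[t], "lags": lags})
    return rows


def shift_lags(row: dict) -> List[float]:
    """Step 6: the current target becomes lag -1 and the oldest lag is dropped."""
    return [row["target"]] + row["lags"][:-1]


def recursive_forecast(last_row: dict,
                       predict: Callable[[List[float]], float],
                       horizon: int) -> List[float]:
    """Steps 5-9: feed each prediction back in as the newest lag, horizon times."""
    row = last_row
    forecasts = []
    for _ in range(horizon):
        new_lags = shift_lags(row)
        prediction = predict(new_lags)
        forecasts.append(prediction)
        # The prediction takes the place of the target for the next iteration.
        row = {"target": prediction, "lags": new_lags}
    return forecasts


def placeholder_model(lags: List[float]) -> float:
    """Hypothetical stand-in for the trained random forest: the mean of the lags."""
    return sum(lags) / len(lags)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy series standing in for cluster_26 (made-up values for illustration only).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rows = make_lag_rows(series, n_lags=3)
last_row = rows[-1]  # equivalent of keeping the last time point in step 5
forecasts = recursive_forecast(last_row, placeholder_model, horizon=168)
print(forecasts[:3])
```

In the workflow, the call to predict corresponds to the Random Forest Predictor (Regression) node, and horizon corresponds to the maximum number of iterations of the Recursive Loop End node. Because each forecast is built on earlier forecasts rather than observed values, errors can accumulate over the 168 steps.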