04_Machine_Learning

Machine Learning - Exercise (Solution)

This workflow predicts the residual of a time series (energy consumption) using machine learning models that take lagged values as predictors. The residual is what remains of the time series after removing the trend and the first and second seasonality.
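As a rough illustration of what "residual" means here, the sketch below removes a linear trend and then two seasonal components by seasonal differencing. The description does not say how the workflow removes them, so the method and the periods of 24 and 168 hours (daily and weekly cycles in hourly data) are assumptions, and the series is synthetic.

```python
import math

def remove_linear_trend(values):
    """Subtract the least-squares straight line fitted over the time index."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return [y - (intercept + slope * t) for t, y in enumerate(values)]

def seasonal_difference(values, period):
    """Subtract the value observed one full period earlier."""
    return [values[t] - values[t - period] for t in range(period, len(values))]

# Synthetic hourly series (assumption): trend + daily cycle + weekly cycle.
hours = range(24 * 7 * 4)
series = [
    2.0 + 0.1 * t
    + 5 * math.sin(2 * math.pi * t / 24)
    + 3 * math.sin(2 * math.pi * t / 168)
    for t in hours
]

detrended = remove_linear_trend(series)
no_daily = seasonal_difference(detrended, 24)    # first seasonality (assumed daily)
residual = seasonal_difference(no_daily, 168)    # second seasonality (assumed weekly)
```

Because the synthetic series contains no noise, the residual here is essentially zero. On real energy data the residual keeps the irregular part that the models are then asked to predict.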

URL: All you need is ... the Lag Column Node! https://www.knime.com/blog/all-you-need-is-the-lag-column-node
URL: The Lag Column Node https://youtu.be/pR_7pIEqW-c
URL: Slides on the KNIME Website https://www.knime.com/form/material-download-registration

Time Series Analysis 04. Machine Learning

Summary: In this exercise we'll train and score a Random Forest and a Linear Regression model.

Instructions:
1) Run the workflow up through the Decompose Signal component. We'll start this exercise from here.
2) Use the Lag Column node with Lag Interval = 1 and Lags = 10. We'll use these 10 past values as the inputs for our models.
3) Partition the data using the Partitioning node. Let's use an 80/20 split. Make sure you check the box to take data from the top. This is important with time series data, because the test set should contain only observations that come after the training set.
4) Apply the Random Forest Learner (Regression) and, optionally, the Linear Regression Learner to the top port of the Partitioning node. Make sure your target is Residual and your inputs are the lagged values: Residual(-n).
5) Use the Random Forest Predictor (Regression) (and the Regression Predictor) nodes after their respective learners. Use the data from the bottom port of the Partitioning node as the input.
6) Apply the Numeric Scorer node to the output of both predictors and see how they did.
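For readers who want to see the data flow outside KNIME, here is a minimal plain-Python sketch of steps 2, 3 and 6: building the lagged columns, splitting from the top, and scoring predictions. It is not the workflow's implementation. The function names and the synthetic series are hypothetical, rows with incomplete lags are dropped (an assumption about how the Lag Column node is configured), and a naive "predict Residual(-1)" rule stands in for the Random Forest and Linear Regression learners.

```python
import math

def lag_columns(values, lags=10, lag_interval=1):
    """Return (target, [value(-1), ..., value(-lags)]) rows, skipping incomplete ones."""
    rows = []
    for t in range(lags * lag_interval, len(values)):
        predictors = [values[t - k * lag_interval] for k in range(1, lags + 1)]
        rows.append((values[t], predictors))
    return rows

def partition_from_top(rows, fraction=0.8):
    """First `fraction` of the rows for training, the rest for testing; no shuffling."""
    cut = int(len(rows) * fraction)
    return rows[:cut], rows[cut:]

def numeric_scores(actual, predicted):
    """R^2, mean absolute error and root mean squared error."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
        "rmse": math.sqrt(ss_res / n),
    }

# Hypothetical residual series; in the workflow this comes from Decompose Signal.
residual = [math.sin(0.3 * t) for t in range(200)]

rows = lag_columns(residual, lags=10, lag_interval=1)
train, test = partition_from_top(rows, fraction=0.8)

# Stand-in model (assumption): predict the most recent lag, Residual(-1).
actual = [target for target, _ in test]
predicted = [predictors[0] for _, predictors in test]
scores = numeric_scores(actual, predicted)
```

The naive rule does not use the training partition at all. A real learner would be fitted on `train` only and then applied to `test`, which is what connecting the top port to the learner and the bottom port to the predictor achieves in the workflow.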
Data Loading
Data Preparation
ACF Plot & seasonality removal

Linear regression

Random forest

convert date/time into Date&Time objects
String to Date&Time
Autocorrelation Plot (Labs)
Missing Value
Line Plot
Joiner
process
Partition from top down for time series data
Table Partitioner
Recursive Loop End
Line Plot
Column Filter
Recursive Loop Start
Recursive Loop Start
10 previous hours
Lag Column
Top k Row Filter
Column Renamer
Top k Row Filter
RowID
Remove variance column
Column Filter
Recursive Loop End
RowID
Energy usage data
CSV Reader
Joiner
Regression Predictor
Numeric Scorer
Numeric Scorer
process
Linear Regression Learner
Random Forest Predictor (Regression)
Random Forest Learner (Regression)
RowID
RowID
Date&Time Aligner (Labs)
Column Renamer