
04_Machine_Learning

Machine Learning - Exercise

This workflow predicts the residual of a time series (energy consumption) with machine learning models that use lagged values as predictors. The residual is what is left of the time series after removing the trend and the first and second seasonality.



Time Series Analysis
04. Machine Learning

Summary:
In this exercise we'll train and score a Random Forest and a Linear Regression model.

Instructions:
1) Run the workflow up through the Decompose Signal component; we'll start this exercise from here.
2) Use the Lag Column node with Lag Interval = 1 and Lags = 10. We'll use these 10 past values as the inputs for our models.
3) Partition the data using the Partitioning node. Let's use an 80/20 split. Make sure you check the box to take data from the top. This is important with time series data.
4) Apply the Random Forest Learner (Regression) and optionally the Linear Regression Learner to the top port of the Partitioning node. Make sure your target is Residual and your inputs are the lagged values: Residual(-n).
5) Use the Random Forest Predictor (Regression) (and the Regression Predictor) nodes after their respective learners. Use the data from the bottom port of our Partitioning node for the input.
6) Apply the Numeric Scorer node to the output of both predictors and see how they did.

Note! The "Residual" column shows the time series after removing the trend and first and second seasonality. In the following you build models to predict values in this column.

Workflow sections: Data Loading; Data Preparation; ACF Plot & seasonality removal; Model Training (create 2 machine learning models, Random Forest (Regression) and Linear Regression).

Workflow nodes: CSV Reader, String to Date&Time, Column Filter, Timestamp Alignment, Imputing Missing Values, Decompose Signal.

Node annotations: Energy usage data; convert date/time into Date&Time objects; introduce missing hours; substituting missing values with average of previous and next.
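For readers who want to see the same steps outside KNIME, here is a minimal Python sketch of instructions 2 through 6: build 10 lagged columns, split 80/20 from the top, fit a model on the first partition, predict on the second, and score. It is an illustration under assumptions, not the workflow's implementation. The series is synthetic stand-in data rather than the energy "Residual" column, the helper names (`make_lags`, `split_from_top`, `fit_linear`, `predict_linear`, `score`) are hypothetical, and only an ordinary least squares linear regression is fitted; a random forest regressor from a library such as scikit-learn could be swapped in at the same place.

```python
import numpy as np


def make_lags(series, n_lags=10):
    """Return (X, y) where column k-1 of X holds the value k steps before y."""
    series = np.asarray(series, dtype=float)
    # Analogous to Lag Column with Lag Interval = 1 and Lags = n_lags:
    # column 0 plays the role of Residual(-1), column 9 of Residual(-10).
    X = np.column_stack(
        [series[n_lags - k : len(series) - k] for k in range(1, n_lags + 1)]
    )
    y = series[n_lags:]
    return X, y


def split_from_top(X, y, train_fraction=0.8):
    """Chronological split: the first rows train, the remaining rows test."""
    cut = int(len(y) * train_fraction)
    return X[:cut], y[:cut], X[cut:], y[cut:]


def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def score(y_true, y_pred):
    """A few common numeric error measures for regression."""
    err = y_true - y_pred
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }


# Synthetic stand-in for the "Residual" column (assumption: an AR(1)-like series).
rng = np.random.default_rng(0)
residual = np.zeros(500)
for t in range(1, 500):
    residual[t] = 0.7 * residual[t - 1] + rng.normal()

X, y = make_lags(residual, n_lags=10)
X_train, y_train, X_test, y_test = split_from_top(X, y, train_fraction=0.8)
coef = fit_linear(X_train, y_train)
scores = score(y_test, predict_linear(coef, X_test))
print(scores)
```

Splitting from the top rather than at random keeps every test row later in time than every training row, so the model is never scored on values that precede its training data.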
