
04_Machine_Learning

Machine Learning - Exercise (Solution)

This workflow predicts the residual of a time series (energy consumption) with machine learning models that use lagged values as predictors. The residual is what remains of the time series after removing the trend and the first and second seasonality.
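To make "lagged values as predictors" concrete, here is a minimal Python sketch of what a lag step produces. It is not the KNIME implementation: the function name `lag_columns`, the toy series, and the choice to drop the first incomplete rows are assumptions made for illustration. The column names follow the Residual(-n) pattern used in the workflow.

```python
def lag_columns(values, lags=10, lag_interval=1, name="Residual"):
    """Return one dict per row: the current value plus its lagged copies.

    Rows that do not yet have a full history are dropped (an assumption
    made for this sketch; a lag node may instead keep them with missing values).
    """
    rows = []
    offset = lags * lag_interval  # first index with a complete history
    for i in range(offset, len(values)):
        row = {name: values[i]}  # target column
        for k in range(1, lags + 1):
            shift = k * lag_interval
            row[f"{name}(-{shift})"] = values[i - shift]  # predictor column
        rows.append(row)
    return rows


# Toy series standing in for the residual column (hypothetical data).
series = [float(v) for v in range(15)]
table = lag_columns(series, lags=10, lag_interval=1)
```

Each row of `table` then holds the target `Residual` and ten predictors `Residual(-1)` to `Residual(-10)`, which is the shape a regression learner expects.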




Time Series Analysis
04. Machine Learning

Summary:
In this exercise we'll train and score a Random Forest and a Linear Regression.

Instructions:
1) Run the workflow up through the Decompose Signal component. We’ll start this exercise from here.
2) Use the Lag Column node with Lag Interval = 1 and Lags = 10. We'll use these 10 past values as the inputs for our models.
3) Partition the data using the Partitioning node. Let’s use an 80/20 split. Make sure you check the box to take data from the top. This is important with time series data.
4) Apply the Random Forest Learner (Regression) and optionally the Linear Regression Learner to the top port of the Partitioning node. Make sure your target is Residual and your inputs are the lagged values: Residual(-n)
5) Use the Random Forest Predictor (Regression) (and the Regression Predictor) nodes after their respective learners. Use the data from the bottom port of our Partitioning node for the input.
6) Apply the Numeric Scorer node to the output of both predictors and see how they did.

Workflow sections:
Data Loading
Data Preparation
ACF Plot & seasonality removal
Model Training: Create 2 Machine Learning models, Random Forest (Regression) and Linear Regression
Save test/seed data and model for comparison and deployment

Note! The "Residual" column shows the time series after removing the trend and first and second seasonality. In the following you build models to predict values in this column.

Node annotations: convert date/time into Date&Time objects; substituting missing values with average of previous and next; 10 previous hours; Introduce missing hours; Energy usage data; Partition from top down for time series data; nrows-720; seed data; RandomForest.model; LinReg.pmml; trend; Deployment.table; seed.table; var.table

Workflow nodes: String to Date&Time, Imputing Missing Values, Lag Column, Column Filter, Timestamp Alignment, CSV Reader, Partitioning, Linear Regression Learner, Regression Predictor, Random Forest Predictor (Regression), Random Forest Learner (Regression), Numeric Scorer (2), Extract Table Dimension, Math Formula (Variable), Row Filter, Merge Variables, Model Writer, PMML Writer (2), Variable to Table Row, Decompose Signal, Table Writer (3)
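The partitioning and scoring steps can be sketched in plain Python. This is an illustration under stated assumptions, not the KNIME nodes themselves: `partition_from_top` and `numeric_scores` are hypothetical helper names, the numbers are toy values, and only four common regression statistics (R², mean absolute error, mean squared error, root mean squared error) are computed.

```python
import math


def partition_from_top(rows, fraction=0.8):
    """Split rows in their original order: the first `fraction` for training,
    the rest for testing. No shuffling, so the test set lies strictly
    after the training set in time."""
    cut = int(round(len(rows) * fraction))
    return rows[:cut], rows[cut:]


def numeric_scores(actual, predicted):
    """Compute R^2, MAE, MSE and RMSE for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                      # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)     # total sum of squares
    mse = ss_res / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


# Toy rows standing in for the lagged table (hypothetical data).
rows = list(range(10))
train, test = partition_from_top(rows, fraction=0.8)

# Toy actual and predicted values standing in for a predictor's output.
scores = numeric_scores([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Taking rows from the top matters because a random split would let the model train on observations that come after the ones it is tested on, which inflates the scores for time series data.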
