
05_Model_Optimization

Hyper Parameter Optimization - Exercise

This workflow optimizes the parameters of a machine learning model that predicts the residual of a time series (energy consumption). The residual is what remains after removing the trend and the first and second seasonality. The optimized parameters are the number of trees and the tree depth of a Random Forest model.
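To make the notion of a residual concrete, here is a minimal sketch of one possible way to compute it. It assumes a straight-line trend and seasonal lags of 24 and 168 hours (daily and weekly cycles in hourly data). These choices, and the function names, are illustrative assumptions and are not taken from the workflow's Decompose Signal step, which may work differently.

```python
import math


def remove_linear_trend(series):
    """Subtract the least-squares straight line fitted against the time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


def seasonal_difference(series, lag):
    """Subtract the value seen `lag` steps earlier; drops the first `lag` points."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def residual(series, first_lag=24, second_lag=168):
    """Remove the trend, then the first and second seasonality (assumed lags)."""
    detrended = remove_linear_trend(series)
    without_first = seasonal_difference(detrended, first_lag)
    return seasonal_difference(without_first, second_lag)


# Synthetic hourly series (made up for illustration):
# linear trend + daily cycle + weekly cycle, four weeks long.
demo = [
    0.05 * t
    + 3 * math.sin(2 * math.pi * t / 24)
    + 2 * math.sin(2 * math.pi * t / 168)
    for t in range(24 * 7 * 4)
]
demo_residual = residual(demo)
```

Because the synthetic series contains nothing but trend and the two cycles, its residual is numerically zero. On real energy data the residual is the irregular part that the Random Forest model is then trained to predict.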



Workflow sections: Data Loading, Data Preparation, ACF Plot & seasonality removal, Hyper Parameter Optimization

Time Series Analysis
05. Hyper Parameter Optimization

Summary:
In this exercise we'll optimize some of the hyper parameters in our Random Forest model.

Instructions:
1) Run the workflow up through the Random Forest Predictor; we'll start from here.
2) Attach a Numeric Scorer to the output of the Predictor and verify that the reference and prediction columns are correct in the configuration. Check the Attach output scores as flow variables option; we'll need these scores as flow variables later to select the best parameters.
3) Next we'll add the Parameter Optimization Loop Start node to our workflow. Its output is a flow variable port. Attach this to the Random Forest Learner.
4) To configure the Parameter Optimization Loop Start we'll add new variables to the table in its configuration. These will represent the range of values we want to try when training.
   Create one with the name NumTrees, with min value 5, max value 100, step size 5.
   Create another with the name TreeDepth, with min value 1, max value 20, step size 1.
   Check the box to indicate both are integers.
   **Execute this node so you see your Flow Variables in the next step.
5) Next configure the Random Forest Learner to use these flow variables. Open the configuration window for the Learner and go to the Flow Variables tab. In the drop-down box next to maxLevels select your TreeDepth flow variable, and in the box next to nrModels select NumTrees. This will instruct KNIME to control those model parameters with your flow variables.
6) Finally add the Parameter Optimization Loop End to the end of your workflow. Attach the flow variable output of your Numeric Scorer node to it. In the configuration window for the Loop End node you can select which metric to optimize for. We'll use Mean Absolute Percentage Error.
Optional) Train a model with the optimized parameters from the loop.

Node annotations: Energy usage data; convert date/time into Date&Time objects; Introduce missing hours; substituting missing values with average of previous and next; 10 previous hours; Partition from top down for time series data

Nodes in the workflow: CSV Reader, String to Date&Time, Timestamp Alignment, Imputing Missing Values, Column Filter, Decompose Signal, Lag Column, Partitioning, Random Forest Learner (Regression), Random Forest Predictor (Regression)
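The optimization loop described in the instructions can be sketched outside KNIME as a plain grid search. The sketch below assumes an exhaustive (brute-force) search strategy, which the instructions do not specify. The grid mirrors the NumTrees and TreeDepth ranges from step 4, and the loop keeps the combination with the lowest Mean Absolute Percentage Error. The function `toy_score` is a hypothetical stand-in for "train the Random Forest, predict, and score"; it is not the workflow's model, and all helper names are illustrative.

```python
from itertools import product


def param_range(start, stop, step):
    """Inclusive integer range, like one row in the Loop Start table."""
    return list(range(start, stop + 1, step))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.1 means 10%).

    Assumes no actual value is exactly zero.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def brute_force_search(grid, score_fn):
    """Try every combination and return (best_params, best_score), minimizing."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Ranges from step 4 of the instructions.
grid = {
    "NumTrees": param_range(5, 100, 5),
    "TreeDepth": param_range(1, 20, 1),
}


def toy_score(NumTrees, TreeDepth):
    # Hypothetical stand-in for training and scoring a model: the error is
    # made up so that it is smallest at NumTrees=50, TreeDepth=8.
    actual = [10.0, 20.0, 40.0]
    error = abs(NumTrees - 50) / 100 + abs(TreeDepth - 8) / 20
    predicted = [a * (1 + error) for a in actual]
    return mape(actual, predicted)


best_params, best_score = brute_force_search(grid, toy_score)
n_combinations = len(grid["NumTrees"]) * len(grid["TreeDepth"])
```

With these ranges an exhaustive search trains 20 x 20 = 400 models, so the step sizes directly control the run time of the loop. Note also that MAPE divides by the actual value, so it can become unstable when the target (here a residual) is close to zero.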
