
06_​H2O_​GBM_​parameter_​optimization

Workflow

H2O Parameter Optimization
This workflow shows how to use parameter optimization in combination with H2O. In this example, we train multiple GBM models using brute-force grid search and use the optimal parameters to train the final model.
Keywords: H2O, machine learning, parameter optimization, grid search
Parameter Optimization with H2O

This tutorial shows how to train multiple H2O models in KNIME using parameter optimization (grid search) and extract the optimal algorithm settings for training the final model. We will train Gradient Boosting Machines for binomial classification using a grid of two different GBM parameters.

1. Prepare: Load and import data to H2O.

2. Optimization: To train models with parameter optimization, we create a loop using the KNIME node "Parameter Optimization Loop Start" (Analytics - Mining). In this node's settings we can define the optimization grid: for this example we optimize the GBM algorithm parameters "Number of trees" and "Max tree depth". We use brute-force optimization, meaning that there will be as many iterations as there are parameter combinations defined in the Parameter Optimization Loop Start node. The "Loop End" node collects the scored metrics of all optimization loop iterations. To extract the optimal algorithm parameters, we sort the collected rows by several metrics and filter the top row.

3. Learn models, do prediction and scoring in the parameter optimization loop: For each combination of parameters, a GBM model is built by H2O using the "Number of trees" and "Max tree depth" parameters of the corresponding loop iteration, and the model accuracy metrics are scored.

4. Train final model: Finally, we use the optimal parameters to train the final model and predict new data.

Workflow annotations: Starting a local H2O cloud · Import train data in H2O · Partition the data into training (70%) and test (30%) · Load training dataset · Combine with other KNIME nodes as controllers · Collect results · Join iteration settings with accuracy statistics · Prepare optimization grid for cross join · Sort by metrics · Extract iteration with best metrics · Optimal parameters as variable · Load new data (not available in training/evaluation) · Import new data in H2O · GBM Learner with parameters controlled by Parameter Optimization Loop variables · Predict probabilities · Compute scores per iteration · Train GBM on whole train data with optimal settings · Predict new data

Nodes used: H2O Local Context, Table to H2O, H2O Partitioning, File Reader, Parameter Optimization Loop Start, Loop End, Cross Joiner, Variable to Table Row, Sorter, Row Filter, Table Row to Variable, File Reader, Table to H2O, H2O Gradient Boosting Machine Learner, H2O Predictor (Classification), H2O Binomial Scorer, H2O Gradient Boosting Machine Learner, H2O Predictor (Classification)
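The loop logic of steps 2 and 3 can be sketched in plain Python. This is a minimal illustration of the brute-force grid search idea, not the KNIME workflow itself: `train_and_score` is a hypothetical stand-in for the H2O Gradient Boosting Machine Learner / Predictor / Binomial Scorer chain, and its dummy accuracy surface exists only so the sketch runs end to end.

```python
from itertools import product

def train_and_score(ntrees, max_depth):
    """Hypothetical stand-in for training a GBM with the given parameters
    and scoring its accuracy on the test partition. The formula below is a
    dummy accuracy surface for illustration only."""
    return 0.80 + 0.001 * ntrees - 0.01 * abs(max_depth - 5)

# Optimization grid, as defined in the Parameter Optimization Loop Start node
# (example values; the workflow's actual grid may differ).
ntrees_grid = [50, 100, 150]
max_depth_grid = [3, 5, 7]

# Brute-force loop: one iteration per parameter combination.
# The Loop End node's role is played by the `results` list.
results = []
for ntrees, max_depth in product(ntrees_grid, max_depth_grid):
    accuracy = train_and_score(ntrees, max_depth)
    results.append({"ntrees": ntrees, "max_depth": max_depth,
                    "accuracy": accuracy})

# Sort the collected rows by the metric and keep the top row
# (Sorter + Row Filter in the workflow).
best = max(results, key=lambda r: r["accuracy"])

# `best` now holds the optimal parameters; in the workflow they are turned
# into flow variables (Table Row to Variable) and fed to the final GBM
# Learner, which is trained on the whole training set.
```

The same pattern generalizes to any number of parameters: the Cross Joiner in the workflow plays the role of `itertools.product`, building one row per parameter combination before the loop consumes them.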

Download

Get this workflow from the following link: Download

Resources

Nodes

06_​H2O_​GBM_​parameter_​optimization consists of the following 18 node(s):

Plugins

06_​H2O_​GBM_​parameter_​optimization contains nodes provided by the following 4 plugin(s):