
06_H2O_GBM_parameter_optimization

H2O Parameter Optimization

This tutorial shows how to train multiple H2O models in KNIME using parameter optimization (grid search) and how to extract the optimal algorithm settings for training the final model. We train Gradient Boosting Machines (GBM) for binomial classification using a grid over two GBM parameters.

1. Prepare:
Load and Import data to H2O.

2. Optimization:
To train models with parameter optimization, we create a loop using the KNIME node "Parameter Optimization Loop Start" (Analytics - Mining). In this node's settings we define the optimization grid: for this example we optimize the GBM algorithm parameters "Number of trees" and "Max tree depth". We use brute force optimization, meaning that there are as many iterations as there are parameter combinations defined in the Parameter Optimization Loop Start node. The "Loop End" node collects the scored metrics of all optimization loop iterations. To extract the optimal algorithm parameters, we sort the collected rows by several metrics and keep only the top row.
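The loop logic described above can be sketched outside KNIME in plain Python. This is only an illustration: the grid values, the metric column names, and the `toy_score` function are assumptions made for the example (the text does not state the actual grid or metrics), and `toy_score` returns made-up numbers in place of a real model evaluation.

```python
from itertools import product

# Hypothetical grid; the workflow's real start/stop/step values are not given in the text.
grid = {
    "Number of trees": [10, 50, 100],
    "Max tree depth": [3, 5],
}


def brute_force_grid(grid):
    """Return one settings dict per parameter combination (the cross product)."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def best_row(rows, metrics):
    """Sort rows by several metrics (higher is better, earlier metric wins) and keep the top row."""
    return sorted(rows, key=lambda row: tuple(row[m] for m in metrics), reverse=True)[0]


def toy_score(settings):
    """Stand-in for the loop body: returns made-up metrics, not real GBM results."""
    trees = settings["Number of trees"]
    depth = settings["Max tree depth"]
    return {
        "AUC": round(0.70 + 0.001 * trees - 0.01 * abs(depth - 5), 3),
        "Accuracy": round(0.60 + 0.02 * depth, 3),
    }


# One iteration per combination, as in brute force optimization.
combinations = brute_force_grid(grid)
# Collect settings and metrics per iteration, similar to what the Loop End node gathers.
rows = [{**settings, **toy_score(settings)} for settings in combinations]
# Sort by several metrics and keep the top row.
optimal = best_row(rows, metrics=["AUC", "Accuracy"])
```

With three values for one parameter and two for the other, the loop runs six iterations. Sorting on a tuple of metrics means the second metric only breaks ties in the first.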

3. Learn Models, do prediction and scoring in Parameter Optimization Loop:
For each combination of parameters, a GBM model is built by H2O using the "Number of trees" and "Max tree depth" values of the corresponding loop iteration, and the model's accuracy metrics are scored.
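To make the scoring step concrete, here is a minimal pure-Python sketch of two common binomial metrics computed from predicted probabilities. The text does not say which metrics the workflow sorts by, so accuracy (at an assumed 0.5 threshold) and AUC are chosen here only as examples; the labels and probabilities are invented for illustration.

```python
def accuracy(labels, probabilities, threshold=0.5):
    """Fraction of rows whose thresholded prediction matches the true label."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for y, y_hat in zip(labels, predictions) if y == y_hat)
    return correct / len(labels)


def auc(labels, probabilities):
    """Area under the ROC curve as the share of (positive, negative) pairs ranked correctly."""
    positives = [p for y, p in zip(labels, probabilities) if y == 1]
    negatives = [p for y, p in zip(labels, probabilities) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(positives) * len(negatives))


# Invented example data: true classes and predicted probability of class 1.
labels = [1, 0, 1, 1, 0, 0]
probabilities = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]

acc_value = accuracy(labels, probabilities)
auc_value = auc(labels, probabilities)
```

The pairwise AUC is quadratic in the number of rows, which is fine for a small illustration but not for large test sets.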

4. Train final model
Finally, we train a GBM on the whole training data with the optimal parameters and use it to predict new data.
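For readers who work with the H2O Python client rather than the KNIME nodes, the final step might look roughly like the sketch below. It is an assumption-laden illustration, not the workflow's implementation: the mapping of the KNIME labels to the `ntrees` and `max_depth` arguments, the function names, and the file paths and target column passed in by the caller are all hypothetical. The `h2o` import is kept inside the function so the parameter mapping can be used without an H2O installation.

```python
# Assumed mapping from the KNIME dialog labels to H2O Python estimator arguments.
KNIME_TO_H2O = {"Number of trees": "ntrees", "Max tree depth": "max_depth"}


def gbm_params_from_row(best_row):
    """Pick the optimized settings out of the best row and rename them for the H2O estimator."""
    return {KNIME_TO_H2O[name]: int(best_row[name]) for name in KNIME_TO_H2O}


def train_final_gbm(train_path, new_data_path, target, best_row):
    """Train a GBM on the whole training data with the optimal settings and predict new data."""
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # start or connect to a local H2O instance
    train = h2o.import_file(train_path)
    new_data = h2o.import_file(new_data_path)

    # A categorical (factor) response makes H2O treat the task as classification.
    train[target] = train[target].asfactor()
    features = [column for column in train.columns if column != target]

    model = H2OGradientBoostingEstimator(**gbm_params_from_row(best_row))
    model.train(x=features, y=target, training_frame=train)
    return model.predict(new_data)


# The best row may carry metric columns as well; only the two settings are used.
final_params = gbm_params_from_row({"Number of trees": 100, "Max tree depth": 5, "AUC": 0.8})
```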

Workflow annotation: Parameter Optimization with H2O
This workflow shows how to use Parameter Optimization in combination with H2O. In the example we train multiple GBM models using brute force grid search and use the optimal parameters to train the final model.

Node annotations: Starting a local H2O Cloud; Import train data in H2O; Partition the data into training (70%) and test (30%); Load training dataset; Combine with other KNIME nodes as controllers; Collect results; Join iteration settings with accuracy statistics; Prepare optimization grid for cross join; Sort by metrics; Extract iteration with best metrics; Optimal parameters as variable; Load new data (not available in training/evaluation); Import new data in H2O; GBM Learner with parameters controlled by Parameter Optimization Loop variables; Predict probabilities; Compute scores per iteration; Train GBM on whole train data with optimal settings; Predict new data.

Nodes in the workflow: H2O Local Context, Table to H2O, H2O Partitioning, File Reader, Parameter Optimization Loop Start, Loop End, Cross Joiner, Variable to Table Row (deprecated), Sorter, Row Filter, Table Row to Variable (deprecated), File Reader, Table to H2O, H2O Gradient Boosting Machine Learner, H2O Predictor (Classification), H2O Binomial Scorer, H2O Gradient Boosting Machine Learner, H2O Predictor (Classification).
