
04_H2O_Crossvalidation

H2O Cross-Validation

This workflow shows how to use cross-validation in H2O with the KNIME H2O nodes. In the example, we use the H2O Random Forest to predict the multiclass response of the IRIS data set using 5 folds and evaluate the cross-validated performance.

1. Prepare:
Import the IRIS data into H2O.

2. Cross Validation:
To run cross-validation with the KNIME H2O nodes, we use the "H2O Cross Validation Loop Start" node and configure it for 5-fold cross-validation with stratified fold assignment. The upper output port contains the training data and the lower output port the test data.
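The idea behind stratified fold assignment can be sketched in plain Python. This is only an illustration of the concept, not the algorithm H2O or the KNIME node actually uses; the function names `stratified_folds` and `split_for_fold` and the seed are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each row a fold index so that every class is spread evenly over the folds.

    Illustrative only: not the fold assignment implemented by H2O.
    """
    rng = random.Random(seed)
    rows_by_class = defaultdict(list)
    for row, label in enumerate(labels):
        rows_by_class[label].append(row)

    fold_of = [None] * len(labels)
    for label in sorted(rows_by_class):
        rows = rows_by_class[label]
        rng.shuffle(rows)  # randomize which rows of a class land in which fold
        for position, row in enumerate(rows):
            fold_of[row] = position % n_folds  # deal rows out round-robin
    return fold_of


def split_for_fold(fold_of, fold):
    """Return (training rows, test rows) for one iteration of the CV loop."""
    train = [i for i, f in enumerate(fold_of) if f != fold]
    test = [i for i, f in enumerate(fold_of) if f == fold]
    return train, test


# IRIS-like class column: three classes with 50 rows each.
labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
fold_of = stratified_folds(labels, n_folds=5)
train_rows, test_rows = split_for_fold(fold_of, fold=0)
```

In each iteration, `train_rows` plays the role of the upper output port and `test_rows` the role of the lower one.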

3. Learn Models in Cross Validation Loop:
For each CV fold, H2O builds a Random Forest with 50 trees and a maximum depth of 15 on the training data of that fold. The test data of the fold is then predicted, appending the class-specific probabilities of class membership (needed for multinomial scoring), and the predictions are scored by the H2O Multinomial Scorer node.
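To see why the class probabilities are needed for scoring, here is a minimal sketch of two multinomial metrics, accuracy and log loss, computed from per-row class probabilities. The function names and the three example rows are made up for illustration; this is not the scorer node's implementation, and the clipping constant `eps` is an assumption.

```python
import math


def accuracy(true_labels, probabilities):
    """Fraction of rows whose most probable class equals the true class."""
    hits = 0
    for truth, probs in zip(true_labels, probabilities):
        predicted = max(probs, key=probs.get)
        hits += predicted == truth
    return hits / len(true_labels)


def log_loss(true_labels, probabilities, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for truth, probs in zip(true_labels, probabilities):
        # Clip so that a probability of exactly 0 or 1 does not break the log.
        p = min(max(probs[truth], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


# Hypothetical predictions for three test rows (not real model output).
truth = ["setosa", "versicolor", "virginica"]
probs = [
    {"setosa": 0.9, "versicolor": 0.05, "virginica": 0.05},
    {"setosa": 0.1, "versicolor": 0.6, "virginica": 0.3},
    {"setosa": 0.1, "versicolor": 0.5, "virginica": 0.4},  # misclassified row
]
fold_accuracy = accuracy(truth, probs)
fold_log_loss = log_loss(truth, probs)
```

Accuracy only needs the predicted class, whereas log loss depends on how much probability the model assigned to the true class, which is why the predictor has to output the probabilities.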

4. Score:
To evaluate the overall performance of all trained random forests, we use the "GroupBy" node to average the per-fold performance metrics, such as Accuracy and LogLoss.
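The aggregation step amounts to taking the mean of each metric over the five folds. A minimal sketch follows, with hypothetical per-fold values standing in for the loop output (they are not results of this workflow) and an assumed helper name `average_metrics`:

```python
from statistics import mean

# Hypothetical per-fold scores, one row per CV iteration (not real results).
fold_scores = [
    {"fold": 0, "Accuracy": 0.90, "LogLoss": 0.30},
    {"fold": 1, "Accuracy": 1.00, "LogLoss": 0.10},
    {"fold": 2, "Accuracy": 0.90, "LogLoss": 0.25},
    {"fold": 3, "Accuracy": 1.00, "LogLoss": 0.15},
    {"fold": 4, "Accuracy": 0.95, "LogLoss": 0.20},
]


def average_metrics(rows, metric_names):
    """Average each named metric over all folds."""
    return {name: mean(row[name] for row in rows) for name in metric_names}


cv_summary = average_metrics(fold_scores, ["Accuracy", "LogLoss"])
```

Reporting the spread across folds alongside the mean (for example with `statistics.stdev`) can also help judge how stable the model is.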

[Workflow diagram: Table Reader (load IRIS data) and H2O Local Context (H2O single-node instance) -> Table to H2O (import table to H2O Frame) -> H2O Cross Validation Loop Start (5-fold CV loop) -> H2O Random Forest Learner (Random Forest with 50 trees, depth 15, no early stopping) -> H2O Predictor (Classification) -> H2O Multinomial Scorer -> Loop End (collect) -> GroupBy (CV statistics)]
