H2O AutoML Learner (Regression)

Trains the selected types of models using H2O AutoML and returns the leading model among them. During training, H2O automatically optimizes the hyperparameters using a random grid search.

Options

General Settings

Target Column: Select the target column. The column must contain numerical values.
Column selection: Select the columns used for model training.
Max. runtime in seconds: Select to specify the maximum runtime for AutoML learning in seconds. Note that this setting can affect AutoML reproducibility (max_runtime_secs).
Max number of models: Select to specify the maximum number of models that should be trained, excluding Stacked Ensemble models (max_models).
Use static random seed: Select to use a static seed for randomization.
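
The general settings above correspond to H2O AutoML parameters of the same names. The following is a minimal sketch, not the node's actual implementation, of how such settings could be collected in Python. The helper name build_automl_kwargs is hypothetical, and mapping "Use static random seed" to H2O's seed parameter is an assumption. The sketch itself does not need the h2o package; the commented lines show where the values would go.

```python
def build_automl_kwargs(max_runtime_secs=None, max_models=None, seed=None):
    """Collect only the general settings that were explicitly set.

    Hypothetical helper: options left unset are omitted so that
    H2O falls back to its own defaults for them.
    """
    candidates = {
        "max_runtime_secs": max_runtime_secs,
        "max_models": max_models,
        "seed": seed,  # assumed counterpart of "Use static random seed"
    }
    return {name: value for name, value in candidates.items() if value is not None}


kwargs = build_automl_kwargs(max_models=10, seed=42)
print(kwargs)

# With the h2o package installed and a cluster running, the settings
# could be passed on like this ("target" and frame are placeholders):
#   from h2o.automl import H2OAutoML
#   aml = H2OAutoML(**kwargs)
#   aml.train(y="target", training_frame=frame)
```

Limiting the run by max_models rather than max_runtime_secs, together with a fixed seed, is the combination less likely to vary between runs, since a time limit depends on the speed of the machine.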

Algorithm Settings

Scoring metric used to select best model: Select the metric used to sort the leaderboard at the end of an AutoML run. The leading model according to this metric is returned (sort_metric).
Include algorithms: Select the algorithms that should be included in the AutoML run. Note that Deep Learning means a multi-layer feedforward artificial neural network. If Stacked Ensemble is checked, a second-level model is learned that stacks/combines the learned and optimized models. Hence, Stacked Ensemble can only be included if at least one other model type is included (include_algos).
Number of folds: The number of folds that should be used for k-fold cross-validation of the models in the AutoML run (nfolds).
Use fold column: Select to specify a column with the cross-validation fold index assignment per observation. The column must not be the same as the target column and must contain either integer or nominal values (fold_column).
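
The two constraints stated above (Stacked Ensemble needs at least one other model type, and the fold column must differ from the target column) can be sketched as simple checks. This is an illustrative sketch only; the function names are hypothetical, and the algorithm identifiers follow the names H2O uses for include_algos (for example "GBM" and "StackedEnsemble").

```python
STACKED_ENSEMBLE = "StackedEnsemble"


def validate_include_algos(include_algos):
    """Reject a selection that contains only Stacked Ensemble (or nothing)."""
    base_models = [algo for algo in include_algos if algo != STACKED_ENSEMBLE]
    if not base_models:
        raise ValueError(
            "At least one model type other than Stacked Ensemble must be included."
        )
    return list(include_algos)


def validate_fold_column(fold_column, target_column):
    """Reject a fold column that is identical to the target column."""
    if fold_column == target_column:
        raise ValueError("The fold column must not be the target column.")
    return fold_column


print(validate_include_algos(["GBM", "DeepLearning", STACKED_ENSEMBLE]))
print(validate_fold_column("fold_id", "price"))
```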

Advanced Settings

Early Stopping: Select to activate early stopping.
Stopping metric: Specify the metric to use for early stopping. The metric is calculated on the cross-validation folds (stopping_metric).
Stopping tolerance: Specify the relative tolerance for metric-based stopping. Training stops if the improvement is less than this value (stopping_tolerance).
Number of last seen rows for moving average: Stops training when the metric selected for stopping_metric doesn’t improve for the specified number of training rounds, based on a simple moving average. The metric is calculated on the cross-validation folds (stopping_rounds).
Max. runtime in seconds per model: Specify the maximum amount of time dedicated to the training of each individual model in the AutoML run. This setting can affect AutoML reproducibility (max_runtime_secs_per_model).
Weight column (optional): Select a column to use for the observation weights, which are used for bias correction. Note that this setting can slightly affect AutoML reproducibility (weights_column).
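
To illustrate how the stopping metric, the stopping tolerance, and the moving-average window interact, the following is a simplified sketch of a moving-average stopping rule. It is not H2O's exact implementation; the function should_stop and its reference-window logic are assumptions made for illustration. It compares the moving average before the last stopping_rounds scoring events with the best moving average since then, and stops when the relative improvement falls below the tolerance.

```python
def should_stop(history, stopping_rounds, stopping_tolerance, lower_is_better=True):
    """Simplified moving-average early stopping check (illustrative only).

    history: metric values, one per scoring event, oldest first.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # not enough scoring events to compare two windows

    recent = history[-2 * k:]
    # k + 1 simple moving averages of window length k over the last 2k values
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]

    reference = averages[0]
    if reference == 0:
        return False  # relative improvement is undefined; keep training

    later = averages[1:]
    if lower_is_better:
        improvement = (reference - min(later)) / abs(reference)
    else:
        improvement = (max(later) - reference) / abs(reference)
    return improvement < stopping_tolerance


print(should_stop([10.0, 8.0, 6.0, 4.0], stopping_rounds=2, stopping_tolerance=0.01))
print(should_stop([5.0, 5.0, 5.0, 5.0], stopping_rounds=2, stopping_tolerance=0.01))
```

In the first call the metric is still improving clearly, so training continues; in the second the metric is flat, so the rule signals a stop.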

Input Ports

H2O Frame with training data.

Output Ports

The best H2O model trained in the AutoML process based on the selected scoring metric. The leading model corresponds to the first row of the leaderboard table.
A leaderboard of the models trained in the AutoML process. The models are ranked by the selected scoring metric, i.e., the model in the first row is the one that is output.
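
The relationship between the two outputs can be sketched as follows: the leaderboard is sorted by the selected scoring metric, and the leading model is the first row. The model names and metric values below are made-up placeholders, and the helper rank_leaderboard is hypothetical; error metrics such as RMSE are assumed to be better when lower.

```python
def rank_leaderboard(models, sort_metric, lower_is_better=True):
    """Return the rows sorted so that the best model comes first."""
    return sorted(
        models,
        key=lambda row: row[sort_metric],
        reverse=not lower_is_better,
    )


# Placeholder rows for illustration only, not real results.
rows = [
    {"model_id": "glm_1", "rmse": 3.2},
    {"model_id": "gbm_1", "rmse": 2.1},
    {"model_id": "drf_1", "rmse": 2.7},
]

leaderboard = rank_leaderboard(rows, sort_metric="rmse")
leader = leaderboard[0]  # the row that corresponds to the output model
print(leader["model_id"])
```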

Views

This node has no views

Installation

To use this node in KNIME, install the KNIME H2O Machine Learning Integration extension from the update site below, following the NodePit Product and Node Installation Guide:

v5.6

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.6.0.v202507151410

On NodePit since: 2025-08-15

Last update: 2025-08-15

KNIME versions: Since v4.4
