H2O Generalized Linear Model Learner (Regression)

Learns a Generalized Linear Model (GLM) regression model using H2O.

Options

General Settings

Target Column: Select target column. Must be numeric for regression problems.
Column selection: Select columns used for model training.
Ignore constant columns: Select to ignore constant columns.
Use static random seed: Select to use static seed for randomization.

Algorithm Settings

Family: Specify the model type (family).
Link: Specify a link function (Family_Default, Identity, Logit, Log, Inverse, or Tweedie). The available link functions depend on the selected family (link).
Solver: Specify the solver to use (AUTO, IRLSM, L_BFGS, COORDINATE_DESCENT_NAIVE, or COORDINATE_DESCENT). The available solvers depend on the selected family. IRLSM is fast on problems with a small number of predictors and for lambda search with an L1 penalty, while L_BFGS scales better for datasets with many columns. COORDINATE_DESCENT is IRLSM with the covariance-updates version of cyclical coordinate descent in the innermost loop. COORDINATE_DESCENT_NAIVE is IRLSM with the naive-updates version of cyclical coordinate descent in the innermost loop (solver).
Set alpha: If enabled, specify how the regularization is distributed between the L1 and L2 penalties. A value of 1 corresponds to a pure L1 (lasso) penalty and a value of 0 to a pure L2 (ridge) penalty. If disabled, H2O determines a default value using a heuristic (alpha).
Set lambda: If enabled, specify the regularization strength. If disabled, H2O determines a default value using a heuristic (lambda).
Enable lambda search: Specify whether to enable lambda search. The search starts at the largest meaningful lambda value (the smallest value that drives all coefficients to zero) and decreases it on a log scale at each step until the minimum lambda is reached. If no lambda minimum ratio is defined, the minimum lambda is calculated automatically. If the number of lambdas is not set, it is also determined by a heuristic. The resulting model uses the "best" lambda value, as evaluated on the validation set (its size can be defined in the Advanced Settings tab) (lambda_search).
More detailed information about the process can also be found here.
Set number of lambdas: (Applicable only if lambda_search is enabled) If enabled, specify the number of lambdas to use in the search. If disabled, H2O determines a default value using a heuristic (nlambdas).
Set lambda minimum ratio: (Applicable only if lambda_search is enabled) If enabled, specify the minimum lambda to use for lambda search (specified as a ratio of lambda_max). If disabled, H2O determines a default value using a heuristic (lambda_min_ratio).
Set beta epsilon: If enabled, specify the beta epsilon value for convergence. If the L1 norm of the current beta change is below this threshold, the model is considered converged. If disabled, H2O determines a default value using a heuristic (beta_epsilon).
Set objective epsilon: If enabled, specify a threshold for convergence. If the objective value is less than this threshold, the model is considered converged. If disabled, H2O determines a default value using a heuristic (objective_epsilon).
Set gradient epsilon: (For L-BFGS only) If enabled, specify a threshold for convergence. If the objective value (using the L-infinity norm) is less than this threshold, the model is considered converged. If disabled, H2O determines a default value using a heuristic (gradient_epsilon).
Tweedie variance power: (Only applicable if Tweedie is specified for Family) Specify the Tweedie variance power (tweedie_variance_power).
Tweedie link power: (Only applicable if Tweedie is specified for Family) Specify the Tweedie link power (tweedie_link_power).
Non negative coefficients: Specify whether to force coefficients to have non-negative values (non_negative).
Set maximum iterations: If enabled, specify the number of training iterations. If disabled, the number of iterations is not limited (max_iterations).
Include a constant term in the model: Specify whether to include a constant term (intercept) in the model. This option is enabled by default (intercept).
Set maximum active predictors: If enabled, specify the maximum number of active predictors during computation. This value is used as a stopping criterion to prevent expensive model building with many predictors. If disabled, H2O determines a default value using a heuristic (max_active_predictors).
Remove collinear columns: (Only applicable if IRLSM is specified for Solver and lambda=0) Specify whether to automatically remove collinear columns during model building. When enabled, collinear columns are dropped from the model and have a coefficient of 0 in the returned model (remove_collinear_columns).
Standardize numeric columns: Specify whether to standardize the numeric columns to have a mean of zero and unit variance (recommended) (standardize).
Missing values handling: Specify how to handle missing values (Skip or MeanImputation) (missing_values_handling).
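
To make the roles of alpha, lambda, and the lambda search settings more concrete, the following is a minimal pure-Python sketch. It assumes the usual elastic net form of the penalty, lambda * (alpha * L1 + (1 - alpha) / 2 * squared L2), and an evenly log-spaced grid of lambda values. The function names are hypothetical and the grid only illustrates the "decreasing on log scale" behaviour described above; the heuristics H2O actually uses may differ.

```python
import math


def elastic_net_penalty(beta, alpha, lam):
    """Return lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def lambda_path(lambda_max, lambda_min_ratio, nlambdas):
    """Return nlambdas log-spaced values from lambda_max down to
    lambda_max * lambda_min_ratio (illustrative grid, not H2O's exact one)."""
    if nlambdas < 2:
        return [lambda_max]
    log_max = math.log(lambda_max)
    log_min = math.log(lambda_max * lambda_min_ratio)
    step = (log_min - log_max) / (nlambdas - 1)
    return [math.exp(log_max + i * step) for i in range(nlambdas)]


# alpha = 1 keeps only the L1 term, alpha = 0 keeps only the L2 term.
penalty_lasso = elastic_net_penalty([1.0, -2.0], alpha=1.0, lam=0.5)
penalty_ridge = elastic_net_penalty([1.0, -2.0], alpha=0.0, lam=0.5)

# Five candidate lambdas, each a factor of 10 smaller than the previous one.
path = lambda_path(lambda_max=1.0, lambda_min_ratio=1e-4, nlambdas=5)
```

In a lambda search, one model would be fitted per value in such a path, and the value that scores best on the validation set would be kept.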

Advanced Settings

Size of validation set (in %): Specify the size of the validation dataset used to evaluate early stopping and lambda search. The option can only be specified if either early stopping or lambda search is enabled.
Early Stopping: Select to activate early stopping. The defined validation set is used to evaluate the criteria for early stopping (early_stopping).
Max runtime in seconds: Maximum allowed runtime in seconds for model training (max_runtime_secs).
Weights column (optional): Select a column containing the observation weights, which are used for bias correction (weights_column).
Offset column (optional): Specify a column to use as the offset. Offsets are per-row “bias values” that are used during model training (offset_column).
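
The names in parentheses above are the underlying H2O parameter names. As a rough illustration of how a comparable configuration might look outside KNIME, the sketch below collects a few of them in a dictionary and passes them to `H2OGeneralizedLinearEstimator` from the H2O Python client. This is an assumption for illustration only: the node itself is configured through its dialog, the helper names `build_glm_params` and `train_glm` are hypothetical, and the chosen values are arbitrary. Note that the Python client spells the lambda parameter `lambda_`, because `lambda` is a reserved word in Python.

```python
def build_glm_params(use_lambda_search=True):
    """Collect a few dialog options under their H2O parameter names.

    All values are arbitrary examples, not recommended defaults.
    """
    params = {
        "family": "gaussian",         # Family
        "alpha": 0.5,                 # Set alpha
        "lambda_search": use_lambda_search,
        "standardize": True,          # Standardize numeric columns
        "intercept": True,            # Include a constant term in the model
        "non_negative": False,        # Non negative coefficients
        "max_runtime_secs": 60,       # Max runtime in seconds
        "seed": 42,                   # Use static random seed
    }
    if use_lambda_search:
        # These two options only apply when lambda search is enabled.
        params["nlambdas"] = 30
        params["lambda_min_ratio"] = 1e-4
    else:
        params["lambda_"] = 0.01      # Set lambda (Python spelling)
    return params


def train_glm(train_frame, target, predictors, params):
    """Train a GLM on an existing H2OFrame (requires a running H2O cluster)."""
    # Imported lazily so the parameter sketch above runs without h2o installed.
    from h2o.estimators.glm import H2OGeneralizedLinearEstimator

    model = H2OGeneralizedLinearEstimator(**params)
    model.train(x=predictors, y=target, training_frame=train_frame)
    return model


params = build_glm_params()
```

`train_glm` is not called here because it needs an H2O cluster and a training frame; only the parameter dictionary is built.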

Input Ports

H2O Frame with training data.

Output Ports

H2O Generalized Linear Model regression model.
Coefficients of the resulting model.

Views

This node has no views

Installation

To use this node in KNIME, install the extension KNIME H2O Machine Learning Integration from the update site below, following our NodePit Product and Node Installation Guide:

v5.5

A zipped version of the software site can be downloaded here.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.5.0.v202504171027

On NodePit since: 2025-07-02

Last update: 2025-07-25

KNIME versions: Since v4.0
