Learns a Generalized Linear Model (GLM) regression model using H2O.

- Target Column
- Select target column. Must be numeric for regression problems.
- Column selection
- Select columns used for model training.
- Ignore constant columns
- Select to ignore constant columns.
- Use static random seed
- Select to use a static seed for randomization.

- Family
- Specify the model type (family).
- Link
- Specify a link function (Family_Default, Identity, Logit, Log, Inverse, and Tweedie). The available link functions depend on the selected family (link).
- Solver
- Specify the solver to use (AUTO, IRLSM, L_BFGS, COORDINATE_DESCENT_NAIVE, or COORDINATE_DESCENT). The available solvers depend on the selected family. IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty, while L_BFGS scales better for datasets with many columns. COORDINATE_DESCENT is IRLSM with the covariance updates version of cyclical coordinate descent in the innermost loop. COORDINATE_DESCENT_NAIVE is IRLSM with the naive updates version of cyclical coordinate descent in the innermost loop (solver).
- Set alpha
- If enabled, specify the regularization distribution between L1 and L2. If disabled, H2O determines a default value using a heuristic (alpha).
- Set lambda
- If enabled, specify the regularization strength. If disabled, H2O determines a default value using a heuristic (lambda).
- Enable lambda search
- Specify whether to enable lambda search. The search starts with the highest sensible lambda value (i.e., the lowest value that drives all coefficients to zero) and then decreases it at each step on a log scale until the minimum lambda is reached. The minimum lambda is calculated automatically if no lambda minimum ratio is defined. The number of lambdas is also determined by a heuristic if undefined. The resulting model uses the "best" lambda value, as evaluated on the validation set (its size can be defined in the *Advanced Settings* tab) (lambda_search).
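
The interplay of alpha, lambda, and lambda search can be sketched in a few lines of plain Python. This is a minimal illustration, not H2O's implementation: the function names `elastic_net_penalty` and `lambda_path` are hypothetical, the penalty uses the common elastic net form (an assumption, since the node description does not state the formula), and the path is assumed to be evenly spaced on a log scale between the maximum lambda and the maximum lambda times the lambda minimum ratio.

```python
import math


def elastic_net_penalty(beta, alpha, lam):
    """Elastic net penalty (assumed form, not taken from the node description).

    alpha = 1 gives a pure L1 (lasso) penalty, alpha = 0 a pure L2 (ridge)
    penalty; lam scales the overall regularization strength.
    """
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


def lambda_path(lambda_max, lambda_min_ratio, nlambdas):
    """Lambdas decreasing on a log scale from lambda_max down to
    lambda_max * lambda_min_ratio (hypothetical helper for illustration)."""
    if nlambdas < 2:
        return [lambda_max]
    return [
        lambda_max * lambda_min_ratio ** (i / (nlambdas - 1))
        for i in range(nlambdas)
    ]


if __name__ == "__main__":
    # Five candidate lambdas spanning four orders of magnitude.
    for lam in lambda_path(lambda_max=1.0, lambda_min_ratio=1e-4, nlambdas=5):
        penalty = elastic_net_penalty([0.5, -1.5, 0.0], alpha=0.5, lam=lam)
        print(f"lambda={lam:.4g} penalty={penalty:.4g}")
```

In an actual lambda search, a model would be fitted for each lambda on the path and the one scoring best on the validation set would be kept; the sketch only generates the candidate values.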

More detailed information about the process can also be found here.
- Set number of lambdas
- (Applicable only if lambda_search is enabled) If enabled, specify the number of lambdas to use in the search. If disabled, H2O determines a default value using a heuristic (nlambdas).
- Set lambda minimum ratio
- (Applicable only if lambda_search is enabled) If enabled, specify the minimum lambda to use for lambda search (specified as a ratio of lambda_max). If disabled, H2O determines a default value using a heuristic (lambda_min_ratio).
- Set beta epsilon
- If enabled, specify the beta epsilon value for convergence. If the L1 norm of the current beta change is below this threshold, the model is considered converged. If disabled, H2O determines a default value using a heuristic (beta_epsilon).
- Set objective epsilon
- If enabled, specify a threshold for convergence. If the objective value is less than this threshold, the model is converged. If disabled, H2O determines a default value using a heuristic (objective_epsilon).
- Set gradient epsilon
- (For L-BFGS only) If enabled, specify a threshold for convergence. If the objective value (using the L-infinity norm) is less than this threshold, the model is converged. If disabled, H2O determines a default value using a heuristic (gradient_epsilon).
- Tweedie variance power
- (Only applicable if Tweedie is specified for Family) Specify the Tweedie variance power (tweedie_variance_power).
- Tweedie link power
- (Only applicable if Tweedie is specified for Family) Specify the Tweedie link power (tweedie_link_power).
- Non negative coefficients
- Specify whether to force coefficients to have non-negative values (non_negative).
- Set maximum iterations
- If enabled, specify the maximum number of training iterations. If disabled, the number of iterations is not limited (max_iterations).
- Include a constant term in the model
- Specify whether to include a constant term in the model. This option is enabled by default (intercept).
- Set maximum active predictors
- If enabled, specify the maximum number of active predictors during computation. This value is used as a stopping criterion to prevent expensive model building with many predictors. If disabled, H2O determines a default value using a heuristic (max_active_predictors).
- Remove collinear columns
- (Only applicable if IRLSM is specified for Solver and lambda=0) Specify whether to automatically remove collinear columns during model building. When enabled, collinear columns are dropped from the model and have a coefficient of 0 in the returned model (remove_colinear_columns).
- Standardize numeric columns
- Specify whether to standardize the numeric columns to have a mean of zero and unit variance (recommended) (standardize).
- Missing values handling
- Specify how to handle missing values (Skip or MeanImputation) (missing_values_handling).
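
The two preprocessing options at the end of this list, missing values handling and standardization, can be illustrated with plain Python. This is a rough sketch under stated assumptions, not the code H2O runs: the helper names `skip_missing`, `mean_impute`, and `standardize` are hypothetical, missing values are represented as `None`, and the population standard deviation is used for scaling (H2O may use a different estimator).

```python
import statistics


def skip_missing(rows):
    """'Skip' handling: drop every row that contains a missing value."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(values):
    """'MeanImputation' handling: replace missing entries of one column
    with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def standardize(values):
    """Rescale one numeric column to zero mean and unit variance.

    Assumption: population standard deviation; a constant column maps to
    all zeros to avoid dividing by zero.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    column = [1.0, None, 3.0]
    filled = mean_impute(column)
    print(filled)
    print(standardize(filled))
```

Standardizing puts all predictors on a comparable scale, which matters for a regularized model because the penalty then treats every coefficient alike.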

- Size of validation set (in %)
- Specify the size of the validation dataset used to evaluate early stopping and lambda search. The option can only be specified if either early stopping or lambda search is enabled.
- Early Stopping
- Select to activate early stopping. The defined validation set will be used to evaluate criteria for early stopping (early_stopping).
- Max runtime in seconds
- Maximum allowed runtime in seconds for model training (max_runtime_secs).
- Weights column (optional)
- Select a column to use for the observation weights, which are used for bias correction (weights_column).
- Offset column (optional)
- Specify a column to use as the offset. Note: Offsets are per-row “bias values” that are used during model training (offset_column).
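
How the weights and offset columns might enter a GLM can be shown with a small sketch. It assumes the usual textbook treatment, which the node description does not spell out: the offset is added to the linear predictor with a fixed coefficient of 1, and each row's loss contribution is multiplied by its weight. The function names are hypothetical, and squared error stands in for the family-specific deviance.

```python
def linear_predictor(row, coefficients, intercept, offset=0.0):
    """Linear predictor for one row; the offset is added as-is and is
    not multiplied by any learned coefficient (assumed convention)."""
    return offset + intercept + sum(c * x for c, x in zip(coefficients, row))


def weighted_squared_error(targets, predictions, weights):
    """Sum of squared errors where each row counts in proportion to
    its observation weight."""
    return sum(
        w * (y - p) ** 2 for y, p, w in zip(targets, predictions, weights)
    )


if __name__ == "__main__":
    rows = [[1.0, 2.0], [0.0, 1.0]]
    offsets = [2.0, 0.0]   # per-row offset column
    weights = [2.0, 0.5]   # per-row weights column
    targets = [4.5, 1.0]
    preds = [
        linear_predictor(r, [0.5, 0.25], intercept=1.0, offset=o)
        for r, o in zip(rows, offsets)
    ]
    print(preds)
    print(weighted_squared_error(targets, preds, weights))
```

With an identity link the linear predictor is the prediction itself; with other link functions the inverse link would be applied to it first.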

- This node has no views

- 01_Compute_LIMEs (KNIME Hub)
- 02 Stock Prediction - Model Training (KNIME Hub)
- 02_AutoML_Regression_and_Classification_Examples (KNIME Hub)
- 02_SHAP_and_Shapley_Values (KNIME Hub)



To use this node in KNIME, install the extension KNIME H2O Machine Learning Integration from the below update site following our NodePit Product and Node Installation Guide:

v5.2

