Learns a Generalized Linear Model (GLM) classification model using H2O.

- Target column selection
- Select target column. Must be nominal for classification problems.
- Column selection
- Select columns used for model training.
- Ignore constant columns
- Select to ignore constant columns.
- Use static random seed
- Select to use static seed for randomization.

- Solver
- Specify the solver to use (AUTO, IRLSM, L_BFGS, COORDINATE_DESCENT_NAIVE, or COORDINATE_DESCENT). IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty, while L_BFGS scales better for datasets with many columns. COORDINATE_DESCENT is IRLSM with the covariance updates version of cyclical coordinate descent in the innermost loop. COORDINATE_DESCENT_NAIVE is IRLSM with the naive updates version of cyclical coordinate descent in the innermost loop. COORDINATE_DESCENT_NAIVE and COORDINATE_DESCENT are currently experimental (solver).
- Family
- Specify the model type (family).
- Link
- Specify a link function (Identity, Family_Default, Logit, Log, Inverse, or Tweedie) (link).
- Alpha
- Specify how the regularization is distributed between L1 and L2: a value of 1 gives a pure L1 (lasso) penalty, a value of 0 a pure L2 (ridge) penalty (alpha).
- Lambda
- Specify the regularization strength (lambda).
- Enable Lambda search
- Specify whether to enable lambda search, starting with lambda max. If you also specify a value for lambda_min_ratio, then this value is interpreted as lambda min. If you do not specify a value for lambda_min_ratio, then GLM will calculate the minimum lambda (lambda_search).
- Number of Lambdas
- (Applicable only if lambda_search is enabled) Specify the number of lambdas to use in the search. The default is 100 (nlambdas).
- Lambda minimum ratio
- Specify the minimum lambda to use for lambda search, specified as a ratio of lambda_max (lambda_min_ratio).
- Beta epsilon
- Specify the beta epsilon value. If the L1 norm of the current beta change is below this threshold, the model is considered converged (beta_epsilon).
- Objective epsilon
- Specify a threshold for convergence. If the objective value is less than this threshold, the model is converged (objective_epsilon).
- Gradient epsilon
- (For L-BFGS only) Specify a threshold for convergence. If the L-infinity norm of the gradient is less than this threshold, the model is converged (gradient_epsilon).
- Non negative?
- Specify whether to force coefficients to have non-negative values (non_negative).
- Max iterations
- Specify the maximum number of training iterations (max_iterations).
- Include a constant term in the model
- Specify whether to include a constant term in the model. This option is enabled by default (intercept).
- Maximum active predictors
- Specify the maximum number of active predictors during computation. This value is used as a stopping criterion to prevent expensive model building with many predictors (max_active_predictors).
- Compute P values
- Request computation of p-values. Only applicable with no penalty (lambda = 0 and no beta constraints). Enabling remove_collinear_columns is recommended: H2O will return an error if p-values are requested, collinear columns are present, and the remove_collinear_columns flag is not enabled (compute_p_values).
- Remove collinear columns
- Specify whether to automatically remove collinear columns during model building. When enabled, collinear columns will be dropped from the model and will have a coefficient of 0 in the returned model. This can only be set if there is no regularization (lambda = 0) (remove_collinear_columns).
- Standardize numeric columns
- Specify whether to standardize the numeric columns to have a mean of zero and unit variance; this is recommended (standardize).
- Missing values handling
- Specify how to handle missing values (Skip or MeanImputation) (missing_values_handling).
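
How Alpha, Lambda, Number of Lambdas, and Lambda minimum ratio interact can be illustrated with a small sketch. It assumes the elastic net penalty in the form lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2) and assumes that the lambda search grid is spaced geometrically from lambda_max down to lambda_max * lambda_min_ratio. The function names are hypothetical, and the code is not taken from the node or from H2O.

```python
def elastic_net_penalty(beta, alpha, lam):
    """Penalty lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2).

    alpha = 1 is a pure L1 (lasso) penalty, alpha = 0 a pure L2 (ridge) one.
    """
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def lambda_path(lambda_max, lambda_min_ratio, nlambdas):
    """Candidate lambdas from lambda_max down to lambda_max * lambda_min_ratio.

    Assumption: geometric (log-scale) spacing; the real grid may differ.
    """
    if nlambdas < 2:
        return [lambda_max]
    step = lambda_min_ratio ** (1.0 / (nlambdas - 1))
    return [lambda_max * step ** i for i in range(nlambdas)]


if __name__ == "__main__":
    coefficients = [1.0, -2.0]
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, elastic_net_penalty(coefficients, alpha, lam=0.5))
    print(lambda_path(lambda_max=1.0, lambda_min_ratio=0.01, nlambdas=3))
```

With lambda search enabled, a model would be fitted for each value on such a path, starting from the strongest penalty.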

- Weight column selection
- Select a column to use for the observation weights, which are used for bias correction (weights_column).
- Max Runtime?
- Maximum allowed runtime in seconds for model training (max_runtime_secs).
- Early Stopping
- Select to activate early stopping.
- Stopping metric
- Specify the metric to use for early stopping (stopping_metric).
- Stopping tolerance
- Specify the relative tolerance for metric-based stopping: training stops if the improvement is less than this value (stopping_tolerance).
- Number of last seen rows for moving average
- Stops training when the option selected for stopping_metric doesn’t improve for the specified number of training rounds, based on a simple moving average. To disable this feature, specify 0. The metric is computed on the validation data (if provided); otherwise, training data is used (stopping_rounds).
- Size of validation set (in %)
- Specify the size of the validation dataset (as a percentage of the input data) used to evaluate the early stopping criteria.
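
The early stopping options can be read as a rule over a history of metric values. The sketch below is a simplified illustration under stated assumptions: the metric is one where lower is better (such as log loss), and the rule compares the simple moving average of the last stopping_rounds scores with that of the window before it. The function name is hypothetical, and H2O's exact rule may differ.

```python
def should_stop(scores, stopping_rounds, stopping_tolerance):
    """Return True if the moving average of the metric stopped improving.

    scores: metric value per scoring round, lower is better (assumption).
    stopping_rounds: window length of the moving average; 0 disables the check.
    stopping_tolerance: minimum relative improvement required to continue.
    """
    k = stopping_rounds
    if k <= 0 or len(scores) < 2 * k:
        return False  # disabled, or not enough history yet
    recent = sum(scores[-k:]) / k
    previous = sum(scores[-2 * k:-k]) / k
    # Continue only if the recent average beats the previous one
    # by at least the relative tolerance.
    threshold = previous - stopping_tolerance * abs(previous)
    return recent >= threshold


if __name__ == "__main__":
    improving = [1.0, 0.8, 0.6, 0.4]
    flat = [0.5, 0.5, 0.5, 0.5]
    print(should_stop(improving, stopping_rounds=2, stopping_tolerance=0.01))
    print(should_stop(flat, stopping_rounds=2, stopping_tolerance=0.01))
```

Averaging over a window keeps a single noisy scoring round from ending training prematurely.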

- This node has no views

- No links available

To use this node in KNIME, install the extension KNIME H2O Machine Learning Integration from the below update site following our NodePit Product and Node Installation Guide:

v4.6
