
XGBoost Linear Ensemble Learner (Regression)

KNIME XGBoost Integration version 4.3.1.v202101261634 by KNIME AG, Zurich, Switzerland

Learns a linear-model-based XGBoost model for regression. XGBoost is a popular machine learning library built on the ideas of boosting. Check out the official documentation for tutorials on how XGBoost works. Since XGBoost requires its features to be single-precision floats, the node automatically casts double-precision values to float, which can cause problems for extreme numbers.
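As an illustration of the precision caveat above, here is a minimal sketch (using NumPy as a stand-in; the node performs the equivalent cast internally) of what happens when double-precision values are cast to single precision:

```python
import numpy as np

# Double-precision values outside float32's range (about 3.4e38) overflow
# to infinity when cast, and values with more than ~7 significant decimal
# digits lose precision. This mirrors the cast the node performs.
doubles = np.array([1.0, 1e40, 1.2345678901234567], dtype=np.float64)
floats = doubles.astype(np.float32)

print(floats[1])                 # inf: 1e40 exceeds the float32 range
print(floats[2] == doubles[2])   # False: precision was lost in the cast
```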

Options

Target column
The column containing the regression target.
Feature columns
Allows selecting which columns should be used as features in training. Note that the domain of nominal features must contain the possible values, otherwise the node can't be executed. Use the Domain Calculator node to calculate any missing possible value sets.
Boosting rounds
The number of models to train in the boosting ensemble.
Use static random seed
If checked, the seed displayed in the text field is used for randomized operations such as sampling. Otherwise, a new seed is generated for each node execution. Note that the Shotgun updater is non-deterministic even if a static seed is set.
Manual number of threads
Allows specifying the number of threads to use for training. If the checkbox is not selected, the number of available cores is used by default.

Objective

Objective
One of
  • linear
  • logistic
  • gamma
  • poisson
  • tweedie
Tweedie regression variance
Controls the variance of the Tweedie distribution. Must be in the range (1, 2); the default is 1.5.
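For reference, the objectives listed above correspond to the following native XGBoost objective names (a mapping based on the official XGBoost documentation; the KNIME node configures these internally, so the dictionary below is illustrative rather than part of the node's API):

```python
# Mapping from the dialog's objective names to native XGBoost objectives,
# following the official XGBoost documentation.
objective_map = {
    "linear": "reg:squarederror",   # plain least-squares regression
    "logistic": "reg:logistic",     # logistic regression
    "gamma": "reg:gamma",           # gamma regression with log-link
    "poisson": "count:poisson",     # poisson regression for count data
    "tweedie": "reg:tweedie",       # tweedie regression with log-link
}

# The "Tweedie regression variance" option maps onto this XGBoost parameter:
params = {
    "booster": "gblinear",                  # this node trains linear models
    "objective": objective_map["tweedie"],
    "tweedie_variance_power": 1.5,          # must lie in (1, 2)
}
```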

Booster

Lambda
L2 regularization term on weights. Increasing this value makes the model more conservative. Normalized to the number of training examples.
Alpha
L1 regularization term on weights. Increasing this value makes the model more conservative. Normalized to the number of training examples.
Updater
Choice of algorithm used to fit the linear model.
  • Shotgun: Parallel coordinate descent algorithm based on shotgun algorithm. Uses ‘hogwild’ parallelism and therefore produces a nondeterministic solution on each run no matter whether a static random seed is set.
  • CoordDescent: Ordinary coordinate descent algorithm. Also multithreaded but still produces a deterministic solution.
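To make the updater choice concrete, here is a minimal single-threaded sketch of cyclic coordinate descent for a linear model with L1 and L2 penalties. This is an illustration of the idea behind CoordDescent, not XGBoost's actual implementation; in particular, XGBoost normalizes lambda and alpha by the number of training examples, which is omitted here.

```python
import numpy as np

def coordinate_descent(X, y, alpha=0.0, lam=0.0, n_rounds=100):
    """Cyclic coordinate descent for least squares with an L1 penalty
    (alpha, via soft-thresholding) and an L2 penalty (lam).
    A simplified, single-threaded illustration of the CoordDescent
    updater, not XGBoost's actual code."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_rounds):
        for j in range(d):                   # cyclic feature selector
            # Residual with feature j's current contribution removed.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r                # univariate correlation
            z = X[:, j] @ X[:, j] + lam      # curvature plus L2 term
            # Soft-thresholding handles the L1 penalty.
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z
    return w
```

The Shotgun updater runs such per-coordinate updates in parallel without locking ("hogwild"), which is why its results vary from run to run even with a fixed seed.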
Feature selector
Feature selection and ordering method.
  • Cyclic: Deterministic selection by cycling through features one at a time.
  • Shuffle: Similar to cyclic but with random feature shuffling prior to update.
  • Random: Randomly (with replacement) selects coordinates.
  • Greedy: Fully deterministic. Setting the top k parameter restricts the selection to the top_k features per group with the largest magnitude of univariate weight change, reducing the complexity to O(num_feature * top_k).
  • Thrifty: Approximately-greedy feature selector. Prior to cyclic updates, reorders features in descending magnitude of their univariate weight changes. This operation is multithreaded and is a linear-complexity approximation of the quadratic greedy selection. Setting the top k parameter restricts the selection to the top_k features per group with the largest magnitude of univariate weight change.
Top k
The number of top features to select in the greedy and thrifty feature selectors. A value of 0 corresponds to using all features.
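The top-k restriction used by the greedy and thrifty selectors can be sketched as follows (a simplified illustration, not XGBoost's implementation; the univariate weight change is approximated by the least-squares update each feature would receive on its own):

```python
import numpy as np

def select_top_k(X, residual, k):
    """Rank features by the magnitude of the univariate weight change
    each would receive in isolation, and keep the k largest. A value of
    k = 0 means "use all features", matching the dialog. Simplified
    sketch, not XGBoost's implementation."""
    # Univariate least-squares update for each feature vs. the residual.
    delta = np.abs(X.T @ residual) / np.einsum("ij,ij->j", X, X)
    if k == 0:
        k = X.shape[1]
    return np.argsort(-delta)[:k]
```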

Input Ports

The data to learn from.

Output Ports

The trained model.

Installation

To use this node in KNIME, install KNIME XGBoost Integration from the following update site:

KNIME 4.3
