Learns a linear XGBoost model for classification. XGBoost is a popular machine learning library based on the ideas of boosting. Check out the official documentation for tutorials on how XGBoost works. Since XGBoost requires its features to be single-precision floats, double-precision values are automatically cast to float, which can cause precision loss for values of extreme magnitude.
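The float-casting caveat can be illustrated directly with NumPy (a minimal sketch; the threshold shown is a property of the IEEE 754 single-precision format itself, not anything specific to XGBoost):

```python
import numpy as np

# float32 has a 24-bit significand, so integers above 2**24 = 16,777,216
# can no longer all be represented exactly. Two distinct float64 values
# therefore collapse to the same float32 value after the cast.
x64 = np.array([16_777_216.0, 16_777_217.0], dtype=np.float64)
x32 = x64.astype(np.float32)

print(x64[0] == x64[1])  # False: distinct as doubles
print(x32[0] == x32[1])  # True: equal after the cast to float
```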

- Objective
- For binary classification tasks, either the binary logistic or the softprob objective function can be used; for more than two classes, only softprob is available.
- Target column
- The column containing the class variable. Note that the column domain must contain the possible values. Please use the Domain Calculator node to calculate the possible values if they are not assigned yet.
- Weight column
- The column containing the row weights (also called sample weights or instance weights). Note that the selected column must not contain missing values.
- Feature columns
- Allows selecting which columns are used as features in training. Note that the domain of nominal features must contain the possible values, otherwise the node can't be executed. Use the Domain Calculator node to calculate any missing possible value sets.
- Boosting rounds
- The number of models to train in the boosting ensemble.
- Base score
- The initial prediction score of all instances; this global bias will have little effect for a sufficiently large number of iterations.
- Use static random seed
- If checked, the seed displayed in the text field is used as the seed for randomized operations such as sampling. Otherwise a new seed is generated for each node execution. Note that the Shotgun updater is always non-deterministic, even if a static seed is set.
- Manual number of threads
- Allows specifying the number of threads to use for training. If the checkbox is not selected, the number of available cores is used.

- Lambda
- L2 regularization term on weights. Increasing this value will make the model more conservative. Normalized to the number of training examples.
- Alpha
- L1 regularization term on weights. Increasing this value will make the model more conservative. Normalized to the number of training examples.
- Updater
- Choice of algorithm to fit the linear model
- Shotgun: Parallel coordinate descent algorithm based on the shotgun algorithm. Uses ‘hogwild’ parallelism and therefore produces a nondeterministic solution on each run, even if a static random seed is set.
- CoordDescent: Ordinary coordinate descent algorithm. Also multithreaded, but still produces a deterministic solution.

- Feature selector
- Feature selection and ordering method.
- Cyclic: Deterministic selection by cycling through features one at a time.
- Shuffle: Similar to cyclic but with random feature shuffling prior to update.
- Random: Randomly (with replacement) selects coordinates.
- Greedy: Fully deterministic. Selection can be restricted to the top k features per group with the largest magnitude of univariate weight change by setting the top k parameter, which reduces the complexity to O(num_feature * top k).
- Thrifty: Approximately-greedy feature selector. Prior to cyclic updates, it reorders features in descending magnitude of their univariate weight changes. This operation is multithreaded and is a linear-complexity approximation of the quadratic greedy selection. Selection can be restricted to the top k features per group with the largest magnitude of univariate weight change by setting the top k parameter.

- Top k
- The number of top features to select for the greedy and thrifty feature selectors. A value of 0 means all features are used.

- This node has no views

To use this node in KNIME, install the KNIME XGBoost Integration extension.

