This node is not available in KNIME v5.12. This page describes the node as of KNIME v5.11.

Decision Tree Learner

This Node Is Deprecated: This version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the new version.

This node induces a classification decision tree in main memory. The target attribute must be nominal. The other attributes used for decision making can be either nominal or numerical. Numeric splits are always binary (two outcomes), dividing the domain into two partitions at a given split point. Nominal splits can either be binary (two outcomes) or have as many outcomes as there are nominal values. In a binary split, the nominal values are divided into two subsets. The algorithm provides two quality measures for split calculation: the Gini index and the gain ratio. In addition, there is a post-pruning method that reduces the tree size and can increase prediction accuracy. The pruning method is based on the minimum description length principle.
The algorithm can run in multiple threads and thus exploit multiple processors or cores.
Most of the techniques used in this decision tree implementation can be found in "C4.5: Programs for Machine Learning" by J. R. Quinlan and in "SPRINT: A Scalable Parallel Classifier for Data Mining" by J. Shafer, R. Agrawal, and M. Mehta (https://www.vldb.org/conf/1996/P544.PDF).
If the optional PMML inport is connected and contains preprocessing operations in the TransformationDictionary, those are added to the learned model.
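The two quality measures named above can be illustrated with a small sketch. It uses the textbook definitions of Gini impurity and gain ratio (as in C4.5); it is not taken from the KNIME source, and the function names and toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini_gain(parent, partitions):
    """Reduction in Gini impurity achieved by splitting parent into partitions."""
    n = len(parent)
    weighted = sum(len(p) / n * gini(p) for p in partitions)
    return gini(parent) - weighted


def gain_ratio(parent, partitions):
    """Information gain divided by the split information (textbook C4.5 form)."""
    n = len(parent)
    info_gain = entropy(parent) - sum(len(p) / n * entropy(p) for p in partitions)
    split_info = -sum(
        (len(p) / n) * math.log2(len(p) / n) for p in partitions if p
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four records with a nominal target.
labels = ["yes", "yes", "no", "no"]
perfect = [["yes", "yes"], ["no", "no"]]   # split separates the classes
useless = [["yes", "no"], ["yes", "no"]]   # split leaves classes mixed
```

Under these definitions, a split that separates the classes completely scores higher on both measures than a split that leaves each partition as mixed as the parent.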

Options

Class column: Selects the target attribute. Only nominal attributes are allowed.
Quality measure: Selects the quality measure according to which the split is calculated. The available measures are "Gini Index" and "Gain Ratio".
Pruning method: Pruning reduces the tree size and avoids overfitting, which improves generalization and thus prediction quality (for predictions, use the "Decision Tree Predictor" node). "Minimal Description Length" (MDL) pruning is available, or pruning can be switched off.
Min number records per node: Selects the minimum number of records required in each node. If the number of records is smaller than or equal to this number, the tree is not grown any further. This corresponds to a stopping criterion (pre-pruning).
Number records to store for view: To select the number of records stored in the tree for the view. The records are necessary to enable highlighting.
Average split point: If checked, the split value for numeric attributes is the mean of the two attribute values that separate the two partitions. If unchecked, the split value is set to the largest value of the lower partition (as in C4.5).
Number threads: This node can exploit multiple threads and thus multiple processors or cores. This can improve performance. The default value is set to the number of processors or cores that are available to KNIME. If set to 1, the algorithm is performed sequentially.
Skip nominal columns without domain information: If checked, nominal columns containing no domain value information are skipped. This is generally the case for nominal columns that have too many different values.
Binary nominal splits: If checked, nominal attributes are split in a binary fashion. Binary splits are more difficult to calculate but also result in more accurate trees. The nominal values are divided into two subsets (one for each child). If unchecked, one child is created for each nominal value.
Max #nominal: The subsets for the binary nominal splits are difficult to calculate. Finding the best subsets for n nominal values requires 2^n calculations. With many different nominal values this can be prohibitively expensive. This option therefore defines the maximum number of nominal values for which all possible subsets are calculated. Above this threshold, a heuristic is applied that first calculates the best nominal value for the second partition, then the second best value, and so on, until no further improvement can be achieved.
Filter invalid attribute values in child nodes: Binary splits on nominal values may lead to tests for attribute values that have already been filtered out by a parent tree node. This happens because the learning algorithm consistently uses the table's domain information, rather than the data in a tree node, to define the split sets. These duplicate checks do no harm (the tree is the same and will classify unknown data in exactly the same way), but they are confusing when the tree is inspected in the tree viewer. Enabling this option post-processes the tree and filters out the invalid checks.
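The "Average split point" and "Max #nominal" options can be sketched as follows. This is a minimal illustration of one reading of the descriptions above, not the node's actual implementation: all function names, the use of weighted Gini impurity as the score, and the toy data are assumptions. The exhaustive search tries every non-trivial subset, while the greedy variant moves one nominal value at a time into the second partition until the score stops improving.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(left, right):
    """Size-weighted Gini impurity of a two-way split (lower is better)."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def numeric_split_point(lower_max, upper_min, average=True):
    """Midpoint of the two separating values, or the largest lower value."""
    return (lower_max + upper_min) / 2 if average else lower_max


def partition(rows, subset):
    """Split (nominal_value, label) rows by membership of the value in subset."""
    left = [y for v, y in rows if v not in subset]
    right = [y for v, y in rows if v in subset]
    return left, right


def exhaustive_split(rows):
    """Try every non-empty proper subset of values; cost grows like 2^n."""
    values = sorted({v for v, _ in rows})
    best = (float("inf"), frozenset())
    for k in range(1, len(values)):
        for combo in combinations(values, k):
            score = weighted_gini(*partition(rows, set(combo)))
            if score < best[0]:
                best = (score, frozenset(combo))
    return best


def greedy_split(rows):
    """Add the single best value to the second partition until no gain."""
    values = {v for v, _ in rows}
    chosen = set()
    best_score = gini([y for _, y in rows])  # impurity without any split
    while len(chosen) < len(values) - 1:
        candidates = [
            (weighted_gini(*partition(rows, chosen | {v})), v)
            for v in sorted(values - chosen)
        ]
        score, value = min(candidates)
        if score >= best_score:
            break  # no further improvement
        best_score = score
        chosen.add(value)
    return best_score, frozenset(chosen)


# Hypothetical toy data: (nominal attribute value, class label).
rows = [("a", "yes"), ("a", "yes"), ("b", "no"),
        ("b", "no"), ("c", "yes"), ("d", "no")]
```

On this toy data both searches find a subset that separates the classes. In general the greedy variant evaluates far fewer candidate subsets but is not guaranteed to find the best one.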

Input Ports

The pre-classified data that should be used to induce the decision tree. At least one attribute must be nominal.
Optional PMML port object containing preprocessing operations.

Output Ports

The induced decision tree. The model can be used to classify data with an unknown target (class) attribute. To do so, connect the model out port to the "Decision Tree Predictor" node.

Views

Decision Tree View: Visualizes the learned decision tree. The tree can be expanded and collapsed with the plus/minus signs.
Decision Tree View (simple): Visualizes the learned decision tree. The tree can be expanded and collapsed with the plus/minus signs. The square brackets show the splitting criterion: the name of the attribute on which the parent node was split, together with the numeric value or the set of nominal values that led to this child. The class value in single quotes is the majority class in this node. The value in round brackets has the form (x of y), where x is the number of records of the majority class and y is the total number of records in this node. The bar with the black border, partly filled with yellow, represents the number of records that belong to the node relative to the parent node. The colored pie chart renders the distribution of the color attribute associated with the input data table. NOTE: The colors do not necessarily reflect the class attribute. If the color distribution and the target attribute should correspond to each other, ensure that the "Color Manager" node colors the same attribute that is selected as the target attribute in this decision tree node.

Installation

To use this node in KNIME, install the extension KNIME Base nodes from the KNIME v5.11 update site.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.11.0.v202602201626

On NodePit since: 2026-03-10

Last update: 2026-06-15

Tags: Deprecated

KNIME versions: From v4.2 to v5.11
