KNIME Base Nodes version 3.6.1.v201808311359 by KNIME AG, Zurich, Switzerland
This node induces a classification decision tree in main memory.
The target attribute must be nominal. The other attributes used for
decision making can be either nominal or numerical. Numeric splits
are always binary (two outcomes), dividing the domain into two partitions at a
given split point. Nominal splits can be either binary (two outcomes) or
have as many outcomes as there are nominal values. In the
case of a binary split, the nominal values are divided into two subsets.
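The two binary split types described above can be illustrated with a minimal Python sketch. It is not the node's implementation: the function names are hypothetical, and the convention that values equal to the split point go to the left partition is an assumption.

```python
from itertools import combinations


def numeric_binary_split(values, split_point):
    """Partition numeric values into two groups at split_point.

    Assumption: values <= split_point go left, the rest go right.
    """
    left = [v for v in values if v <= split_point]
    right = [v for v in values if v > split_point]
    return left, right


def nominal_binary_splits(categories):
    """Enumerate every way to divide nominal values into two non-empty subsets."""
    cats = sorted(set(categories))
    if len(cats) < 2:
        return []  # nothing to split on
    first, rest = cats[0], cats[1:]
    splits = []
    # Keep `first` in the left subset so mirror-image splits are not repeated.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = {first, *extra}
            right = set(cats) - left
            splits.append((left, right))
    return splits


if __name__ == "__main__":
    print(numeric_binary_split([1, 5, 3, 8], 4))
    for left, right in nominal_binary_splits(["red", "green", "blue"]):
        print(sorted(left), "|", sorted(right))
```

A nominal attribute with n distinct values has 2^(n-1) - 1 such binary partitions, which is why exhaustive enumeration is only practical for attributes with few values.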
The algorithm provides two quality measures for split calculation:
the Gini index and the gain ratio. In addition, there is a
post-pruning method to reduce the tree size and increase prediction
accuracy. The pruning method is based on the minimum
description length principle.
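As an illustration of the two quality measures, the following Python sketch computes the Gini index and the gain ratio using their standard textbook definitions. The function names are hypothetical, and the node's own calculation may differ in details such as the handling of missing values.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def weighted(measure, partitions):
    """Size-weighted average of an impurity measure over the partitions."""
    total = sum(len(p) for p in partitions)
    return sum(len(p) / total * measure(p) for p in partitions if p)


def gain_ratio(parent, partitions):
    """Information gain divided by the split information of the partitioning."""
    n = len(parent)
    info_gain = entropy(parent) - weighted(entropy, partitions)
    split_info = -sum(len(p) / n * log2(len(p) / n) for p in partitions if p)
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    parent = ["yes", "yes", "no", "no"]
    parts = [["yes", "yes"], ["no", "no"]]
    print("Gini before split:", gini(parent))
    print("Weighted Gini after split:", weighted(gini, parts))
    print("Gain ratio:", gain_ratio(parent, parts))
```

A lower weighted Gini index after the split, or a higher gain ratio, indicates a better split. The gain ratio normalizes information gain by the split information, which penalizes splits that produce many small partitions.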
The algorithm can run in multiple threads and thus exploit multiple processors or cores.
Most of the techniques used in this decision tree implementation can be found in "C4.5: Programs for Machine Learning" by J. R. Quinlan and in "SPRINT: A Scalable Parallel Classifier for Data Mining" by J. Shafer, R. Agrawal, and M. Mehta (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104.152&rep=rep1&type=pdf).
To use this node in KNIME, install KNIME Base Nodes from the following update site: