Simple Regression Tree Learner

This Node Is Deprecated: this version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the version linked below.
Go to Suggested Replacement: Simple Regression Tree Learner

Learns a single regression tree. The procedure follows the algorithm described in "Classification and Regression Trees" (Breiman et al., 1984). The current implementation applies several simplifications: there is no pruning, missing values are ignored, and trees are not necessarily binary.
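
The core step of a CART-style regression tree is choosing the split that minimizes the summed squared error of the target within the resulting children. The following is a minimal sketch of that step for one numeric attribute, not this node's actual implementation. The function names (sse, best_numeric_split) are hypothetical, and the use of midpoints between neighboring values as candidate thresholds is an assumption.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_numeric_split(
    xs: List[float], ys: List[float]
) -> Optional[Tuple[float, float]]:
    """Return (threshold, sse_after_split) for the best 'x <= threshold' split.

    Returns None when all attribute values are equal, so no split exists.
    Candidate thresholds are midpoints between sorted neighbors (an assumption).
    """
    pairs = sorted(zip(xs, ys))
    best: Optional[Tuple[float, float]] = None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2.0
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

A learner would repeat this search over every attribute, keep the split with the lowest error, and recurse into the children until a stopping criterion is met.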

Options

Attribute Selection

Target Column
Select the column containing the value to be learned. Rows with missing values in this column will be ignored during the learning process.
Attribute Selection

Select the attributes to use to learn the model. Two variants are possible.

Fingerprint attribute uses the different bit/count positions in the selected bit/byte vector as learning attributes (for instance a bit/byte vector of length 1024 is expanded to 1024 binary/count attributes). All bit/byte vectors in the selected column must have the same length.

Column attributes are nominal and numeric columns used as descriptors. Numeric columns are split with a binary test of the form value <= threshold; nominal columns are currently split by creating one child node for each of the values.
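
The two attribute variants and the two split styles can be illustrated with a short sketch. It is illustrative only: the function names are hypothetical, a fingerprint is represented as a plain string of '0' and '1' characters, and each row is assumed to be an (attribute value, target value) pair.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, float]  # (attribute value, target value)


def expand_fingerprint(bits: str) -> List[int]:
    """Expand a bit string such as '1010' into one binary attribute per position."""
    return [int(ch) for ch in bits]


def split_numeric(
    rows: Sequence[Tuple[float, float]], threshold: float
) -> Tuple[List[Tuple[float, float]], List[Tuple[float, float]]]:
    """Binary split: rows with value <= threshold go left, the rest go right."""
    left = [r for r in rows if r[0] <= threshold]
    right = [r for r in rows if r[0] > threshold]
    return left, right


def split_nominal(rows: Sequence[Row]) -> Dict[Hashable, List[Row]]:
    """Multiway split: one child per distinct nominal value."""
    children: Dict[Hashable, List[Row]] = defaultdict(list)
    for row in rows:
        children[row[0]].append(row)
    return dict(children)
```

Because a nominal split creates one child per value, a column with many distinct values produces many small children, which is why the resulting tree is not necessarily binary.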

Ignore columns without domain information
If selected, nominal columns with no domain information are ignored (as they likely have too many possible values anyway).
Enable Highlighting (#patterns to store)
If selected, the node stores the selected number of rows and allows highlighting them in the node view.

Tree Options

Limit number of levels (tree depth)
Number of tree levels to be learned. For instance, a value of 1 would only split the (single) root node (decision stump).
Minimum split node size
Minimum number of records in a decision tree node for another split to be attempted. Note that this option does not imply anything about the minimum number of records in a terminal node. If enabled, this number needs to be at least twice as large as the minimum child node size (otherwise, for binary splits, one of the two children would have fewer records than specified).
Minimum child node size
Minimum number of records in child nodes. It can be at most half of the minimum split node size (see above). Note, this parameter is currently ignored for nominal splits.
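
The interplay of the three tree options can be sketched as simple checks. This is a hedged illustration under stated assumptions, not the node's code: the function and parameter names are hypothetical, and the root node is assumed to sit at depth 0, so a level limit of 1 allows only the root to be split.

```python
def sizes_are_consistent(min_split_node_size: int, min_child_node_size: int) -> bool:
    """The split size must be at least twice the child size."""
    return min_split_node_size >= 2 * min_child_node_size


def may_attempt_split(
    n_records: int, depth: int, min_split_node_size: int, max_levels: int
) -> bool:
    """A node is split only if it is large enough and the level limit allows it.

    Assumes the root is at depth 0, so max_levels=1 yields a decision stump.
    """
    return n_records >= min_split_node_size and depth < max_levels


def binary_split_is_acceptable(
    n_left: int, n_right: int, min_child_node_size: int
) -> bool:
    """Both children of a binary (numeric) split must meet the minimum size."""
    return n_left >= min_child_node_size and n_right >= min_child_node_size
```

For example, with a minimum split node size of 10 and a minimum child node size of 5, a node with exactly 10 records can only be split evenly into two children of 5 records each.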

Input Ports

The data to learn from. It must contain at least one numeric target column and either a fingerprint (bit-vector/byte-vector) column or another numeric or nominal column.

Output Ports

The trained model.

Views

This node has no views
