Uplift Tree Learner

The Uplift Tree Learner node builds decision trees for uplift modeling using a dedicated split criterion. The node requires two classification columns: a control group column and a target group column. Both columns must be binary. In the control group column, a flag indicates whether, for example, a customer was a member of the control group. Similarly, the flag in the target group column records whether the customer responded to a specific treatment. It is good practice to use 0 and 1 as flags, where 1 marks a customer who is a member of the control group (in the control group column) or who has responded to a treatment (in the target group column).
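
To make the role of the two flag columns concrete, here is a minimal Python sketch. It is not the node's implementation. It assumes the 0/1 convention described above and computes the quantity that uplift models commonly estimate: the response rate of customers outside the control group minus the response rate of customers inside it. The field names `control` and `target` and the helpers `response_rate` and `uplift` are hypothetical.

```python
def response_rate(rows):
    """Share of rows whose target flag is 1; 0.0 for an empty group."""
    if not rows:
        return 0.0
    return sum(r["target"] for r in rows) / len(rows)


def uplift(rows):
    """Response rate of treated customers minus that of the control group."""
    control = [r for r in rows if r["control"] == 1]  # flag 1: control group
    treated = [r for r in rows if r["control"] == 0]  # flag 0: received treatment
    return response_rate(treated) - response_rate(control)


# Hypothetical data: 4 treated customers (3 responded), 4 control (1 responded).
rows = [
    {"control": 0, "target": 1},
    {"control": 0, "target": 1},
    {"control": 0, "target": 0},
    {"control": 0, "target": 1},
    {"control": 1, "target": 1},
    {"control": 1, "target": 0},
    {"control": 1, "target": 0},
    {"control": 1, "target": 0},
]
print(uplift(rows))  # 0.75 - 0.25 = 0.5
```

A positive value suggests the treatment raised the response rate in this group; a value near zero suggests the treatment made little difference.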

Options

Control group column
Column containing a flag to identify members of the control group.
Target group column
Column containing a flag to identify if customers responded to a treatment or not.
Control group column flag
The value of the flag that marks a customer as a member of the control group.
Target group column flag
The value of the flag that indicates if a customer has responded to a treatment.
Define split criterion
Selects the split criterion the Uplift Tree is built upon. Only binary splits are supported.
Min number records per node
The minimum number of records per node. Works as a stopping criterion.
Skip nominal columns without domain information
If the input table contains no domain information for a nominal attribute, that attribute is ignored while the Uplift Tree is built. Nominal columns lack domain information when they have a large number of unique values: if a nominal column has more than 63 unique values, KNIME does not create domain information for it.
Max #nominal
The subsets for the binary nominal splits are expensive to calculate. Finding the best subsets for n nominal values requires 2^n calculations, which can be prohibitively expensive when there are many distinct nominal values. This option therefore sets the maximum number of nominal values for which all possible subsets are calculated. Above this threshold, a heuristic is applied that first determines the best nominal value for the second partition, then the second-best value, and so on, until no further improvement can be achieved.
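
The difference between the exhaustive subset search and the heuristic described under Max #nominal can be sketched in Python as follows. This is an illustration under stated assumptions, not the node's code: the node's actual split criterion is not given here, so `score` uses a hypothetical stand-in (the absolute gap in uplift between the two partitions). The names `stats`, `uplift`, `score`, `exhaustive_split` and `greedy_split` are invented for the example.

```python
from itertools import combinations


def uplift(stats, values):
    """Treated response rate minus control response rate over `values`."""
    t_n = sum(stats[v][0] for v in values)  # treated customers
    t_r = sum(stats[v][1] for v in values)  # treated responders
    c_n = sum(stats[v][2] for v in values)  # control customers
    c_r = sum(stats[v][3] for v in values)  # control responders
    if t_n == 0 or c_n == 0:
        return 0.0
    return t_r / t_n - c_r / c_n


def score(stats, second):
    """Hypothetical criterion: uplift gap between the two partitions."""
    first = set(stats) - set(second)
    if not first or not second:
        return 0.0
    return abs(uplift(stats, first) - uplift(stats, second))


def exhaustive_split(stats):
    """Try every non-empty proper subset as the second partition."""
    values = sorted(stats)
    best, best_score = frozenset(), 0.0
    for k in range(1, len(values)):
        for subset in combinations(values, k):
            s = score(stats, subset)
            if s > best_score:
                best, best_score = frozenset(subset), s
    return best, best_score


def greedy_split(stats):
    """Move one value at a time into the second partition while it helps."""
    second, best_score = set(), 0.0
    remaining = set(stats)
    while len(remaining) > 1:  # keep at least one value in the first partition
        candidate = max(sorted(remaining),
                        key=lambda v: score(stats, second | {v}))
        s = score(stats, second | {candidate})
        if s <= best_score:  # no improvement: stop
            break
        second.add(candidate)
        remaining.discard(candidate)
        best_score = s
    return frozenset(second), best_score


# Hypothetical counts per nominal value:
# (treated, treated responders, control, control responders)
stats = {
    "A": (100, 30, 100, 10),
    "B": (100, 20, 100, 20),
    "C": (100, 10, 100, 25),
    "D": (100, 25, 100, 10),
}
print(exhaustive_split(stats))
print(greedy_split(stats))
```

The exhaustive search grows exponentially with the number of nominal values, while the greedy variant evaluates at most on the order of n^2 candidate subsets. The greedy result is never better than the exhaustive one and may be worse, which is the trade-off the threshold controls.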

Input Ports

The pre-classified data that should be used to induce the Uplift Tree. At least two attributes must be nominal.

Output Ports

The induced Uplift Tree. The model can be used to classify data whose target (class) attribute is unknown. To do so, connect the model out port to the "Uplift Tree Predictor" node.

Views

Uplift Tree View
A detailed and interactive view of the Uplift Tree.
Uplift Tree View (simple)
A simple directory-like view of the Uplift Tree.
