Association Rule Learner

The association rule learner* searches for frequent itemsets meeting the user-defined minimum support criterion and, optionally, creates association rules from them. The column containing the transactions (BitVectors or Collections) has to be selected. The minimum support as an absolute number must be provided (therefore check the number of transactions to obtain a sensible criterion). If the frequent itemsets should be free (unconstrained) or closed or maximal has also be defined. Closed itemsets are frequent itemsets, which have no superset with the same support, thus providing all the information from free itemsets in a compressed form. Maximal itemsets are sets which have no frequent superset at all. The maximal itemset length must also be defined. If association rules are generated, a confidence value has to be provided. The confidence is a value to define how often the rule is right. Association rules generated here are in the form to have only one item in the consequence. The underlying data structure used by the algorithm can be either an ARRAY or a TIDList. Choose the former when there are many transactions an less items, and the latter if the structure of the input data is vice versa.

(*) RULE LEARNER is a registered trademark of Minitab, LLC and is used with Minitab’s permission.

Options

Column containing transactions
Select the column containing the transactions (BitVector or Collection) to mine for frequent itemsets or association rules. There must be at least one, since this is the only valid input for the subgroup miner.
Minimum support (0-1)
An itemset is considered to be frequent if there are at least "minimum support" transactions, where the itemset occurs. Make sure, to have here a meaningful number in proportion of the number of rows of the input.
Underlying data structure
Either ARRAY or TIDList: ARRAY is recommended when the number of transactions (rows) is larger than the number of items, and the TIDList if the number of rows is small and the number of items large. In general, the ARRAY option needs more memory and is faster, whereas the TIDList need less memory but is slower.
Itemset type
Choose either free, closed or maximal. Free are mostly redundant, closed provide the most information and maximal may hide some information.
Maximal itemset length
The maximal length of the resulting itemsets. A lower value may reduce the runtime if there are very long frequent itemsets.
Output association rules
Check if association rules should be generated out of the frequent itemsets. Note: association rules are always generated from free frequent itemsets and are constrained to have only one item in the consequence.
Minimum confidence
The confidence is a measure for "how often the rule is right". Thus, how often, if the items in the antecedence appeared also the consequence occurred in the transactions.

Input Ports

Icon
Datatable containing transactions.

Output Ports

Icon
Datatable with discovered frequent itemsets or association rules.

Popular Successors

Views

This node has no views

Workflows

Links

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.