Correlation Filter

KNIME Base Nodes version 4.0.1.v201908131444 by KNIME AG, Zurich, Switzerland

This node uses the model as generated by a Correlation node to determine which columns are redundant (i.e. correlated) and filters them out. The output table will contain the reduced set of columns.

The filtering step works roughly as follows: for each column in the correlation model, the number of columns correlated with it is counted, given a threshold value for the correlation coefficient (specified in the dialog). The column with the most correlated columns is chosen to "survive", and all columns correlated with it are filtered out. This procedure is repeated until no more columns can be identified. Finding a minimum set of columns that satisfies the constraints is difficult to solve analytically; the method applied here is known to be a good approximation, however.
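
The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the node's actual implementation: the function name `reduce_columns`, the use of the absolute correlation value, the `>=` comparison, and the tie-breaking rule (lowest column index wins) are all assumptions made for this example.

```python
def reduce_columns(names, corr, threshold):
    """Greedy sketch: return the column names that survive the filtering.

    names     -- list of column names
    corr      -- square matrix (list of lists) of correlation coefficients
    threshold -- two columns count as correlated if |r| >= threshold (assumption)
    """
    remaining = set(range(len(names)))  # columns not yet kept or filtered out
    survivors = []
    while remaining:
        # For each undecided column, collect the undecided columns correlated with it.
        neighbours = {
            i: {j for j in remaining if j != i and abs(corr[i][j]) >= threshold}
            for i in remaining
        }
        # Keep the column with the most correlated partners
        # (assumed tie-break: lowest index).
        best = min(remaining, key=lambda i: (-len(neighbours[i]), i))
        survivors.append(best)
        # Filter out everything correlated with the survivor.
        remaining -= neighbours[best] | {best}
    return [names[i] for i in sorted(survivors)]


# Hypothetical correlation matrix for four columns.
names = ["a", "b", "c", "d"]
corr = [
    [1.00, 0.95, 0.90, 0.10],
    [0.95, 1.00, 0.50, 0.20],
    [0.90, 0.50, 1.00, 0.05],
    [0.10, 0.20, 0.05, 1.00],
]

print(reduce_columns(names, corr, 0.8))   # ['a', 'd']
print(reduce_columns(names, corr, 0.92))  # ['a', 'c', 'd']
```

In the first call, column `a` is correlated with both `b` and `c`, so it survives and the other two are dropped; `d` is correlated with nothing and is kept. Once no correlated pairs remain, every leftover column is kept as is.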

Options

Columns from Model
Displays the set of columns for which the model has information. These columns must also be present in the input data table. The (automatically) selected elements in the list will be present in the output table. This list cannot be edited.
Correlation Threshold
Choose the correlation threshold here. The higher the value, the fewer columns get filtered out. Hit Enter or click "Calculate" to see a preview of the filtered columns. The counts of included vs. excluded columns are shown in the label.
Calculate
Click this button to update the statistics. It will determine the reduced set of columns using the procedure outlined above.
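
To illustrate why a higher threshold filters out fewer columns, the following minimal sketch counts how many column pairs are treated as correlated at different thresholds. The coefficients are hypothetical, and the helper `correlated_pairs`, the use of the absolute value, and the `>=` comparison are assumptions made for this example rather than documented behavior of the node.

```python
# Hypothetical pairwise correlation coefficients for four columns.
pairwise = {
    ("a", "b"): 0.95,
    ("a", "c"): 0.90,
    ("a", "d"): 0.10,
    ("b", "c"): 0.50,
    ("b", "d"): 0.20,
    ("c", "d"): -0.85,
}


def correlated_pairs(pairwise, threshold):
    """Return the column pairs whose absolute correlation reaches the threshold."""
    return sorted(pair for pair, r in pairwise.items() if abs(r) >= threshold)


# Raising the threshold shrinks the set of pairs considered redundant.
for threshold in (0.8, 0.92, 0.99):
    pairs = correlated_pairs(pairwise, threshold)
    print(threshold, len(pairs), pairs)
```

With these values, three pairs count as correlated at 0.8, one at 0.92, and none at 0.99, so fewer columns are candidates for removal as the threshold rises.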

Input Ports

The model from the Correlation node.
Numeric input data to filter. It must contain the set of columns that were used to create the correlation model. (Typically you connect the input data from the Correlation node here.)

Output Ports

Filtered data from input.

Installation

To use this node in KNIME, install KNIME Base Nodes from the following update site:

KNIME 4.0