Feature Selection Loop Start (2:2)

KNIME Base Nodes version 3.6.0.v201807061308 by KNIME AG, Zurich, Switzerland

This node is the start of the feature selection loop. The feature selection loop lets you select, from all the features in the input data set, the subset of features that is best for model construction. With this node you determine (i) which features/columns are held fixed in the selection process; these constant or "static" features/columns are included in each loop iteration and are exempt from elimination; (ii) which selection strategy is used on the other (variable) features/columns; and (iii) at which threshold number of variable features the selection process terminates. This node has two input ports and two output ports. In each case the first port is intended for training data and the second port for test data. The same filter is applied to both tables, so they always contain the same columns.
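The effect of applying one filter to both tables can be sketched in plain Python. This is only an illustration of the behaviour described above, not KNIME's implementation: the helper `filter_columns`, the dictionary-based tables, and the column names are all assumptions made for the example.

```python
def filter_columns(table, static_cols, iteration_cols):
    """Keep the static columns plus the variable columns chosen for this iteration.

    `table` is a hypothetical stand-in for a data table: a dict that maps
    column names to lists of values.
    """
    keep = set(static_cols) | set(iteration_cols)
    # Preserve the original column order of the input table.
    return {name: values for name, values in table.items() if name in keep}


# Hypothetical training and test tables with identical columns.
train = {"target": [0, 1, 1], "age": [31, 45, 27],
         "income": [40, 85, 52], "height": [170, 182, 165]}
test = {"target": [1, 0], "age": [38, 52],
        "income": [61, 47], "height": [175, 168]}

static_cols = ["target"]            # never removed by the selection process
iteration_cols = ["age", "height"]  # variable columns kept in this iteration

# The same filter is applied to both tables.
train_out = filter_columns(train, static_cols, iteration_cols)
test_out = filter_columns(test, static_cols, iteration_cols)

print(list(train_out))
print(list(test_out))
```

Because both calls use the same column lists, the two output tables end up with the same columns, while their rows remain untouched.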

Options

Static and Variable Features
Columns can be selected manually or by means of regular expressions. The columns in the left pane are the static columns; those in the right pane are the variable columns. Since a feature selection process always has a target feature and a set of features to select from, there will always be at least one static column and more than one variable column. If you leave the left pane empty and run the node, you will get a warning. Columns can be moved from one pane to the other by clicking the appropriate button in the middle.
Feature selection strategy
Here you can choose between two selection strategies: Forward Feature Selection and Backward Feature Elimination.
Use threshold for number of features
Check this option if you want to set a bound for the number of selected features. Since Forward Feature Selection adds features while Backward Feature Elimination subtracts them, this will be an upper bound for Forward Feature Selection and a lower bound for Backward Feature Elimination.
Select threshold for number of features
Set the upper or lower bound for the number of selected features.
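To make the role of the threshold concrete, the following is a minimal sketch of greedy forward selection and backward elimination. It is a generic textbook formulation under stated assumptions and not necessarily how the KNIME loop works internally. The function names, the `score` callback (standing in for whatever model evaluation happens inside the loop), and the toy weights are all hypothetical.

```python
def forward_selection(features, score, max_features=None):
    """Greedy forward selection: add the best-scoring feature each round.

    `max_features` acts as an upper bound on the number of selected features.
    """
    limit = len(features) if max_features is None else min(max_features, len(features))
    selected, remaining, history = [], list(features), []
    while remaining and len(selected) < limit:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(selected)))
    return history


def backward_elimination(features, score, min_features=1):
    """Greedy backward elimination: drop one feature each round.

    `min_features` acts as a lower bound on the number of selected features.
    """
    selected = list(features)
    history = [(list(selected), score(selected))]
    while len(selected) > max(min_features, 1):
        # Drop the feature whose removal leaves the highest-scoring subset.
        drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
        selected.remove(drop)
        history.append((list(selected), score(selected)))
    return history


# Toy score (an assumption for illustration): sum of made-up usefulness weights.
weights = {"age": 3, "income": 2, "height": 0, "zip": -1}

def toy_score(subset):
    return sum(weights[f] for f in subset)

variable_cols = ["age", "income", "height", "zip"]
forward_history = forward_selection(variable_cols, toy_score, max_features=2)
backward_history = backward_elimination(variable_cols, toy_score, min_features=2)

print(forward_history[-1])
print(backward_history[-1])
```

In this sketch, forward selection stops growing the subset once it holds `max_features` columns, whereas backward elimination stops shrinking it once only `min_features` columns remain, which mirrors the upper-bound and lower-bound reading of the threshold option.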

Input Ports

A data table containing all features and static columns needed for the feature selection. (Training data)
A data table containing all features and static columns needed for the feature selection. (Test data)

Output Ports

The input table with some columns filtered out. (Training data)
The input table with some columns filtered out. (Test data)

Update Site

To use this node in KNIME, install KNIME Base Nodes from the following update site:
