Low Variance Filter

Filters out double-compatible columns whose variance is below a user-defined threshold. Columns with low variance are likely to distract certain learning algorithms (in particular distance-based ones) and are therefore better removed.
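
The filtering idea can be sketched in a few lines of Python. This is an illustration only, not the node's implementation: the function name `low_variance_filter`, the dict-of-lists table layout, the use of the sample variance, and the `<=` comparison (chosen so that a bound of 0 removes constant columns) are all assumptions.

```python
from statistics import variance


def low_variance_filter(table, upper_bound, include=None):
    """Return a copy of `table` without low-variance numeric columns.

    `table` maps column names to lists of values (hypothetical layout).
    `include` lists the columns considered for filtering; None means all.
    """
    candidates = set(table) if include is None else set(include)
    kept = {}
    for name, values in table.items():
        # Only numeric columns are candidates; booleans are not treated as numbers.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if name in candidates and is_numeric and len(values) > 1:
            # Assumption: sample variance and an inclusive comparison.
            if variance(values) <= upper_bound:
                continue  # drop this column
        kept[name] = values
    return kept


table = {
    "constant": [5.0, 5.0, 5.0, 5.0],
    "narrow": [1.0, 1.1, 0.9, 1.0],
    "wide": [1.0, 10.0, 20.0, 35.0],
    "label": ["a", "b", "c", "d"],
}

filtered_zero = low_variance_filter(table, 0.0)   # removes only "constant"
filtered_small = low_variance_filter(table, 0.1)  # also removes "narrow"
print(list(filtered_zero))
print(list(filtered_small))
```

Non-numeric columns and columns outside `include` pass through unchanged, mirroring the behavior described for this node.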

Note that the input table should not be normalized with Gaussian normalization or any other normalization technique that changes the variances of the input columns.
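
To see why normalization interferes with the filter, consider Gaussian (z-score) normalization, which rescales every non-constant column to unit variance. The sketch below uses made-up columns and a hypothetical helper `z_score`; after normalization both columns have the same variance, so a variance threshold can no longer tell them apart.

```python
from statistics import mean, stdev, variance


def z_score(values):
    """Gaussian normalization: subtract the mean, divide by the standard deviation.

    Assumes the column is not constant (a zero standard deviation would
    cause a division by zero).
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


narrow = [1.0, 1.1, 0.9, 1.0]    # small spread
wide = [1.0, 10.0, 20.0, 35.0]   # large spread

var_before = (variance(narrow), variance(wide))
var_after = (variance(z_score(narrow)), variance(z_score(wide)))

print(var_before)  # clearly different variances
print(var_after)   # both approximately 1.0
```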

Options

Variance Upper Bound
Choose a variance value here. The higher the value, the more columns are likely to be filtered out. Choose 0 to filter out only columns that contain a single constant value.

Column Filter

Include
This list contains the names of the columns that are considered for filtering. Any other column will be left untouched (i.e. it will also be present in the output table regardless of its variance).
Enforce Inclusion
Select this option to enforce the current inclusion list to stay the same even if the input table specification changes. New columns will automatically be added to the exclusion list.
Buttons
Use these buttons to move columns between the Include and Exclude list. Single-arrow buttons will move all selected columns. Double-arrow buttons will move all columns (filtering is taken into account).
Filter
Use one of these fields to filter either the Include or Exclude list for certain column names or name substrings.
Exclude
This list contains the names of the columns of the input table that will be left untouched (i.e. will also be present in the output table independent of their variance).
Enforce Exclusion
Select this option to enforce the current exclusion list to stay the same even if the input table specification changes. New columns will automatically be added to the inclusion list.
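
The difference between the two enforce options shows up when the input table gains new columns. The following sketch is one reading of the descriptions above; the function `resolve_columns` and its parameters are hypothetical and are not part of the node.

```python
def resolve_columns(current_columns, saved_list, enforce="inclusion"):
    """Split the current input columns into (include, exclude) lists.

    `saved_list` is the list the user fixed earlier: the Include list for
    enforce="inclusion", the Exclude list for enforce="exclusion".
    """
    saved = set(saved_list)
    if enforce == "inclusion":
        # The Include list stays as configured; unknown columns are excluded.
        include = [c for c in current_columns if c in saved]
        exclude = [c for c in current_columns if c not in saved]
    elif enforce == "exclusion":
        # The Exclude list stays as configured; unknown columns are included.
        exclude = [c for c in current_columns if c in saved]
        include = [c for c in current_columns if c not in saved]
    else:
        raise ValueError("enforce must be 'inclusion' or 'exclusion'")
    return include, exclude


# The column "new" was added to the input table after configuration.
columns = ["a", "b", "c", "new"]

by_inclusion = resolve_columns(columns, ["a", "b"], enforce="inclusion")
by_exclusion = resolve_columns(columns, ["c"], enforce="exclusion")
print(by_inclusion)
print(by_exclusion)
```

With Enforce Inclusion the new column is left untouched by the filter; with Enforce Exclusion it becomes a candidate for removal.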

Wildcard/Regex Selection

Type a search pattern that matches the columns to move into the Include or Exclude list; you can specify which list is used. The pattern can be written either with wildcards ('?' matches any single character, '*' matches any sequence of characters) or as a regular expression. You can also specify whether the pattern is case sensitive.
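
As an illustration of the two pattern types, the sketch below translates a wildcard pattern into a regular expression and matches it against column names. The helper names `wildcard_to_regex` and `match_columns` are hypothetical, and matching the whole column name (rather than a substring) is an assumption about the node's behavior.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '?' and '*' wildcards into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")    # any single character
        elif ch == "*":
            parts.append(".*")   # any sequence of characters
        else:
            parts.append(re.escape(ch))  # literal character
    return "".join(parts)


def match_columns(columns, pattern, use_regex=False, case_sensitive=True):
    """Return the column names that match `pattern` as a whole."""
    source = pattern if use_regex else wildcard_to_regex(pattern)
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(source, flags)
    return [c for c in columns if compiled.fullmatch(c)]


columns = ["temp_1", "temp_22", "Temp_3", "humidity"]

wild_single = match_columns(columns, "temp_?")
wild_any = match_columns(columns, "temp_*", case_sensitive=False)
regex_digits = match_columns(columns, r"temp_\d{2}", use_regex=True)
print(wild_single, wild_any, regex_digits)
```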

Input Ports

Numeric input data. (Non-numeric columns will be left untouched.)

Output Ports

Filtered data.

Views

This node has no views.
