The first section, "Treatment by Column", permits individual settings for each available column. Columns that are not listed here are treated according to the settings in the second section, "Treatment by Data Type". More than one column can be selected within one element; in that case, only the treatment options that are applicable to all selected columns remain enabled.
The second section "Treatment by Data Type" provides default handling options for all columns of a given type. These settings apply to all columns in the input table that are not explicitly mentioned in the first section "Treatment by Column".
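The precedence between the two sections can be sketched as a simple lookup. This is a minimal illustration, not KNIME's implementation: the function name, the dictionary layout, and the example column and type names are all hypothetical.

```python
def resolve_treatment(column, column_type, by_column, by_type):
    """Pick the treatment for one column (hypothetical helper).

    A setting in "Treatment by Column" wins; otherwise the default
    from "Treatment by Data Type" applies. Returns None when neither
    section configures anything for the column.
    """
    if column in by_column:
        return by_column[column]
    return by_type.get(column_type)


# Illustrative settings; the column and type names are made up.
by_column = {"age": "Median"}
by_type = {"Integer": "Mean", "String": "Most Frequent Value"}
```

With these settings, "age" would be handled by Median even though it is an integer column, while any other integer column would fall back to Mean.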
Treatment options marked with an asterisk (*) result in non-standard PMML, which uses extensions that cannot be read by tools other than KNIME.
Average Interpolation*
This missing value handler replaces missing values with the average value of
the previous and next encountered non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding main memory. Use it with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
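The replacement rule described above can be sketched as follows. This is an illustration under assumptions, not the node's code: missing values are represented as None, and a missing value that has no non-missing neighbour on one side is left untouched, which the description does not specify.

```python
def average_interpolate(values):
    """Replace each None with the mean of the nearest non-missing
    values before and after it (sketch; edge handling is assumed)."""
    result = list(values)
    n = len(values)
    prev = None  # last non-missing value seen so far
    i = 0
    while i < n:
        if values[i] is not None:
            prev = values[i]
            i += 1
            continue
        # Find the end of this run of missing values.
        j = i
        while j < n and values[j] is None:
            j += 1
        nxt = values[j] if j < n else None
        # Fill only when both neighbours exist (assumption).
        if prev is not None and nxt is not None:
            fill = (prev + nxt) / 2
            for k in range(i, j):
                result[k] = fill
        i = j
    return result
```

In this sketch every cell in a run of consecutive missing values receives the same average, so `[1.0, None, None, 4.0]` becomes `[1.0, 2.5, 2.5, 4.0]`.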
Fix Value (Double)
Replaces missing values with a double given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (Integer)
Replaces missing values with an integer number given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (String)
Replaces missing values with a string given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (Long)
Replaces missing values with a long given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value
No description provided.
Linear Interpolation*
This missing value handler replaces missing values with the linear interpolation
between the previous and next encountered
non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding main memory. Use it with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
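A minimal sketch of linear interpolation over a column, assuming that missing values are None, that the interpolation runs over the row position, and that missing values before the first or after the last non-missing value stay missing. None of these details are stated in the description.

```python
def linear_interpolate(values):
    """Fill each gap with evenly spaced values between the
    surrounding non-missing cells (sketch; row index is the x-axis)."""
    result = list(values)
    prev_idx = None  # index of the last non-missing value
    for idx, value in enumerate(values):
        if value is None:
            continue
        if prev_idx is not None and idx - prev_idx > 1:
            start = values[prev_idx]
            step = (value - start) / (idx - prev_idx)
            for k in range(prev_idx + 1, idx):
                result[k] = start + step * (k - prev_idx)
        prev_idx = idx
    return result
```

Unlike a plain average of the two neighbours, a gap of several rows receives a ramp here: `[1.0, None, None, 4.0]` becomes `[1.0, 2.0, 3.0, 4.0]`.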
Maximum
Finds the column's largest value and replaces all missing values with it.
This missing value handler produces valid PMML 4.2.
Mean
Calculates the mean value of all non-missing cells in a column
and replaces the missing values with this mean.
This missing value handler produces valid PMML 4.2.
Median
Finds the column's median value and replaces all missing values with it.
For large tables this might be computationally expensive because the table needs
to be sorted to find the median.
This missing value handler produces valid PMML 4.2.
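As an illustration of the median treatment, the sketch below uses Python's standard `statistics.median`, which sorts the non-missing values internally. Representing missing values as None is an assumption, as is the behaviour for an even number of values, where `statistics.median` averages the two middle values; the node may choose differently.

```python
import statistics


def fill_with_median(values):
    """Replace every None with the median of the non-missing values."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to compute a median from; leave the column as is.
        return list(values)
    median = statistics.median(present)
    return [median if v is None else v for v in values]
```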
Minimum
Finds the column's smallest value and replaces all missing values with it.
This missing value handler produces valid PMML 4.2.
Most Frequent Value
Calculates the most frequent value in a column
and replaces the missing values with it.
This missing value handler produces valid PMML 4.2.
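The most frequent value treatment can be sketched with `collections.Counter`. Missing values are represented as None here, and ties are resolved in favour of the value seen first, which is how `Counter.most_common` behaves; how the node breaks ties is not stated, so treat that as an assumption.

```python
from collections import Counter


def fill_with_most_frequent(values):
    """Replace every None with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    # most_common(1) returns a list with one (value, count) pair.
    most_common, _count = Counter(present).most_common(1)[0]
    return [most_common if v is None else v for v in values]
```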
Moving Average*
Calculates the mean of all values that are within the window given by
the lookahead and lookbehind and replaces missing values with this mean.
This missing value handler does not produce standard PMML 4.2!
The number of cells to take into account before and after the current cell can be
set using the lookbehind and lookahead options, respectively.
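The window logic can be sketched as below. The parameter names mirror the lookbehind and lookahead options; everything else is an assumption made for illustration: missing values are None, other missing cells inside the window are skipped, the mean is taken over the original values rather than already replaced ones, and a cell whose window holds no values stays missing.

```python
def moving_average_fill(values, lookbehind, lookahead):
    """Replace each None with the mean of the non-missing values in
    the window [i - lookbehind, i + lookahead] around it (sketch)."""
    result = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        start = max(0, i - lookbehind)
        end = min(len(values), i + lookahead + 1)
        # Use the original column so earlier fills do not feed later ones.
        window = [v for v in values[start:end] if v is not None]
        if window:
            result[i] = sum(window) / len(window)
    return result
```

For example, with lookbehind 2 and lookahead 1, the missing cell in `[1.0, 2.0, None, 6.0, 9.0]` is replaced by the mean of 1.0, 2.0, and 6.0.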
Next*
This missing value handler replaces missing values with the next encountered
non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding main memory. Use it with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
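A minimal sketch of the "next" rule, assuming missing values are represented as None and that trailing missing values with no later value stay missing. Scanning the column from the end is a choice of this sketch, not necessarily how the node works.

```python
def fill_with_next(values):
    """Replace each None with the next non-missing value below it."""
    result = list(values)
    nxt = None  # nearest non-missing value seen while scanning upward
    for i in range(len(values) - 1, -1, -1):
        if values[i] is None:
            result[i] = nxt
        else:
            nxt = values[i]
    return result
```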
Previous*
This missing value handler replaces missing values with the last encountered
non-missing value in the column it is configured for.
This missing value handler does not produce standard PMML 4.2!
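The "previous" rule can be sketched as a single forward pass. Missing values are represented as None here, and leading missing values with no earlier value are assumed to stay missing.

```python
def fill_with_previous(values):
    """Replace each None with the last non-missing value above it."""
    last = None  # most recent non-missing value
    result = []
    for v in values:
        if v is None:
            result.append(last)
        else:
            last = v
            result.append(v)
    return result
```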
Remove Row*
This missing value handler removes rows that have a missing value in the column
it is configured for.
This missing value handler does not produce standard PMML 4.2!
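As a rough illustration, row removal is a filter on the configured column. The sketch assumes rows are dictionaries keyed by column name and that a missing value is None; both are conventions of this example only.

```python
def remove_rows_with_missing(rows, column):
    """Keep only the rows that have a value in the given column."""
    return [row for row in rows if row.get(column) is not None]


# Illustrative table; the column names are made up.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 27},
]
```

Missing values in other columns do not affect the result; only the configured column decides whether a row is kept.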
Rounded Mean
Calculates the mean value of all non-missing cells in a column,
rounds it, and replaces the missing values with this rounded mean.
This missing value handler produces valid PMML 4.2.
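The handler's name suggests that the mean is rounded to a whole number before it is used. The sketch below illustrates that reading for an integer column. Representing missing values as None and rounding halves up are assumptions; the description does not state the rounding mode.

```python
import math


def fill_with_rounded_mean(values):
    """Replace every None with the mean of the non-missing values,
    rounded to the nearest integer (halves round up; assumed)."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    mean = sum(present) / len(present)
    rounded = math.floor(mean + 0.5)  # assumption: round half up
    return [rounded if v is None else v for v in values]
```

Rounding keeps the replacement value an integer, so the column's type does not have to change to hold it.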
To use this node in KNIME, install the extension KNIME Base nodes.