This node helps handle missing values found in cells of the input
table. The first tab in the dialog (labeled "Default") provides
default handling options for all columns of a given type.
These settings apply to all columns in the input table that are not
explicitly mentioned in the second tab, labeled "Individual". This
second tab permits individual settings for each available column
(thus overriding the default). To make use of this second approach,
select a column or a list of columns that need
extra handling, click "Add", and set the parameters. Clicking on the
label with the column name(s) selects all covered columns
in the column list. To remove this extra handling (and instead use
the default handling), click the "Remove" button for this column.
Options marked with an asterisk (*) will result in non-standard PMML,
which uses extensions that cannot be read by tools other than KNIME.
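The relationship between the "Default" and "Individual" tabs can be pictured as a simple lookup: an individual setting for a column wins, otherwise the default for the column's type applies. The following is only an illustrative sketch in Python, not KNIME code; the function name, the dictionaries, and the type labels are hypothetical.

```python
from typing import Dict, Optional


def resolve_handler(
    column_name: str,
    column_type: str,
    defaults_by_type: Dict[str, str],
    individual: Dict[str, str],
) -> Optional[str]:
    """Return the handler name for a column (hypothetical helper).

    An entry in `individual` overrides the per-type default.
    None means that no handling is configured for the column.
    """
    if column_name in individual:
        return individual[column_name]
    return defaults_by_type.get(column_type)


# Hypothetical configuration: defaults per type plus one override.
defaults_by_type = {"double": "Mean", "string": "Most Frequent Value"}
individual = {"price": "Linear Interpolation"}

print(resolve_handler("price", "double", defaults_by_type, individual))
print(resolve_handler("weight", "double", defaults_by_type, individual))
```

Removing the entry for "price" from `individual` corresponds to clicking "Remove": the column then falls back to the default for its type.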
Average Interpolation*
This missing value handler replaces missing values with the average value of
the previous and next encountered non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding the main memory. Use this option with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
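The idea behind average interpolation can be sketched in a few lines of Python. This is an illustration of the description above, not the node's implementation; missing cells are modelled as None, and the choice to leave a cell missing when it has no previous or no next non-missing value is an assumption.

```python
from typing import List, Optional


def average_interpolation(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each None with the mean of the previous and next known value."""
    n = len(values)

    # next_known[i] holds the nearest non-missing value at or after index i.
    next_known: List[Optional[float]] = [None] * n
    upcoming: Optional[float] = None
    for i in range(n - 1, -1, -1):
        if values[i] is not None:
            upcoming = values[i]
        next_known[i] = upcoming

    result: List[Optional[float]] = []
    previous: Optional[float] = None
    for i, value in enumerate(values):
        if value is not None:
            previous = value
            result.append(value)
        elif previous is not None and next_known[i] is not None:
            result.append((previous + next_known[i]) / 2)
        else:
            # Assumption: without both neighbours the cell stays missing.
            result.append(None)
    return result


print(average_interpolation([None, 1.0, None, None, 4.0, None]))
# → [None, 1.0, 2.5, 2.5, 4.0, None]
```

Note that every cell in a gap receives the same value, because only the two bordering values are averaged.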
Fix Value (Double)
Replaces missing values with a double given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (Integer)
Replaces missing values with an integer number given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (String)
Replaces missing values with a string given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value (Long)
Replaces missing values with a long given by the user.
This missing value handler produces valid PMML 4.2.
Fix Value
No description provided.
Linear Interpolation*
This missing value handler replaces missing values with the linear interpolation
between the previous and next encountered
non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding the main memory. Use this option with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
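In contrast to average interpolation, linear interpolation gives each cell in a gap its own value on the straight line between the two bordering values. The Python sketch below illustrates this under two assumptions that the description does not state: rows are treated as equally spaced (interpolation by row position), and cells without a previous or next non-missing value stay missing.

```python
from typing import List, Optional


def linear_interpolation(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill gaps (None) on a straight line between the bordering known values."""
    result = list(values)
    last_idx: Optional[int] = None  # index of the most recent non-missing value
    for i, value in enumerate(values):
        if value is None:
            continue
        if last_idx is not None and i - last_idx > 1:
            start = values[last_idx]
            # Constant increment per row between the two known values.
            step = (value - start) / (i - last_idx)
            for j in range(last_idx + 1, i):
                result[j] = start + step * (j - last_idx)
        last_idx = i
    return result


print(linear_interpolation([None, 1.0, None, None, 4.0, None]))
# → [None, 1.0, 2.0, 3.0, 4.0, None]
```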
Maximum
Finds the column's largest value and replaces all missing values with it.
This missing value handler produces valid PMML 4.2.
Mean
Calculates the mean value of all non-missing cells in a column
and replaces the missing values with this mean.
This missing value handler produces valid PMML 4.2.
Median
Finds the column's median value and replaces all missing values with it.
For large tables this might be computationally expensive because the table needs
to be sorted to find the median.
This missing value handler produces valid PMML 4.2.
Minimum
Finds the column's smallest value and replaces all missing values with it.
This missing value handler produces valid PMML 4.2.
Most Frequent Value
Calculates the most frequent value in a column
and replaces the missing values with it.
This missing value handler produces valid PMML 4.2.
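The Maximum, Mean, Median, Minimum, and Most Frequent Value handlers share one pattern: compute a single statistic over the non-missing cells of a column, then write it into every missing cell. A minimal Python sketch of that pattern follows, using the standard `statistics` module; the helper name is hypothetical, and how the node breaks ties for the most frequent value is not stated in the description (`statistics.mode` returns the first mode encountered in Python 3.8 and later).

```python
import statistics
from typing import Callable, List, Optional, Sequence


def fill_with_statistic(
    values: Sequence[Optional[object]],
    statistic: Callable[[List[object]], object],
) -> List[Optional[object]]:
    """Replace None with one statistic computed over the non-missing values."""
    known = [v for v in values if v is not None]
    if not known:
        # Assumption: a column without any known value is left unchanged.
        return list(values)
    replacement = statistic(known)
    return [replacement if v is None else v for v in values]


column = [1.0, None, 3.0, 8.0]
print(fill_with_statistic(column, statistics.mean))    # → [1.0, 4.0, 3.0, 8.0]
print(fill_with_statistic(column, statistics.median))  # → [1.0, 3.0, 3.0, 8.0]
print(fill_with_statistic(column, max))                # → [1.0, 8.0, 3.0, 8.0]
print(fill_with_statistic(["a", None, "b", "a"], statistics.mode))
# → ['a', 'a', 'b', 'a']
```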
Moving Average*
Calculates the mean of all values that are within the window given by
the lookahead and lookbehind and replaces missing values with this mean.
The number of cells to take into account before and after the current cell can be
set using the options lookbehind and lookahead, respectively.
This missing value handler does not produce standard PMML 4.2!
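The window logic can be sketched in Python as follows. This is an illustration rather than the node's implementation: it assumes that other missing cells inside the window are skipped, that the mean is taken over the original (not already replaced) values, and that a cell stays missing when its window contains no known value.

```python
from typing import List, Optional


def moving_average_fill(
    values: List[Optional[float]], lookbehind: int, lookahead: int
) -> List[Optional[float]]:
    """Replace None with the mean of the known values in the surrounding window."""
    result = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        # Window: `lookbehind` cells before and `lookahead` cells after cell i.
        start = max(0, i - lookbehind)
        end = min(len(values), i + lookahead + 1)
        window = [v for v in values[start:end] if v is not None]
        if window:
            result[i] = sum(window) / len(window)
    return result


column = [1.0, 2.0, None, 4.0, 6.0]
print(moving_average_fill(column, lookbehind=1, lookahead=1))
# → [1.0, 2.0, 3.0, 4.0, 6.0]
print(moving_average_fill(column, lookbehind=2, lookahead=2))
# → [1.0, 2.0, 3.25, 4.0, 6.0]
```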
Next*
This missing value handler replaces missing values with the next encountered
non-missing value in the column it is configured for.
When a table has a large number of rows but only a few columns
that need missing value replacement, the option to use disk-backed statistics
avoids flooding the main memory. Use this option with caution, as it is generally
much slower than in-memory statistics.
This missing value handler does not produce standard PMML 4.2!
Previous*
This missing value handler replaces missing values with the last encountered
non-missing value in the column it is configured for.
This missing value handler does not produce standard PMML 4.2!
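The Next and Previous handlers correspond to what is often called backward fill and forward fill. A short Python sketch of both is given here for illustration; leaving a cell missing when no value follows (Next) or precedes (Previous) it is an assumption, and the function names are hypothetical.

```python
from typing import List, Optional


def fill_previous(values: List[Optional[object]]) -> List[Optional[object]]:
    """Replace None with the last non-missing value seen so far."""
    result = []
    last = None
    for value in values:
        if value is not None:
            last = value
        result.append(last)
    return result


def fill_next(values: List[Optional[object]]) -> List[Optional[object]]:
    """Replace None with the next non-missing value (forward fill on the reversed list)."""
    return fill_previous(values[::-1])[::-1]


column = [None, 1, None, None, 3, None]
print(fill_previous(column))  # → [None, 1, 1, 1, 3, 3]
print(fill_next(column))      # → [1, 1, 3, 3, 3, None]
```

Previous can be computed in a single pass, whereas Next needs knowledge of later rows, which may explain why only the latter mentions the disk-backed statistics option.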
Remove Row*
This missing value handler removes rows that have a missing value in the column
it is configured for.
This missing value handler does not produce standard PMML 4.2!
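Unlike the other handlers, Remove Row changes the number of rows instead of filling cells. As a small illustration in Python, with rows modelled as dictionaries and None standing in for a missing cell (both modelling choices are assumptions):

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def remove_rows_with_missing(rows: List[Row], column: str) -> List[Row]:
    """Keep only the rows whose cell in `column` is not missing."""
    return [row for row in rows if row.get(column) is not None]


table = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": None},
    {"id": 3, "price": 4.0},
]
print(remove_rows_with_missing(table, "price"))
# → [{'id': 1, 'price': 9.5}, {'id': 3, 'price': 4.0}]
```

Missing values in other columns do not cause a row to be removed; only the configured column is checked.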
Rounded Mean
Calculates the mean value of all non-missing cells in a column, rounds it,
and replaces the missing values with this rounded mean.
This missing value handler produces valid PMML 4.2.
To use this node in KNIME, install the extension KNIME Base nodes.