KNIME Statistic Nodes version 3.7.0.v201811071020 by KNIME AG, Zurich, Switzerland
This node detects and treats outliers in each of the selected columns individually by means of the interquartile range (IQR).
To detect the outliers for a given column, the first and third quartiles (Q_{1}, Q_{3}) are computed.
An observation is flagged as an outlier if it lies outside the range R = [Q_{1} - k(IQR), Q_{3} + k(IQR)] with IQR = Q_{3} - Q_{1} and k >= 0.
With k = 1.5, the smallest value in R typically corresponds to the lower end of a boxplot's whisker and the largest value to its upper end.
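The detection rule can be sketched in a few lines of Python. This is only an illustration of the formula above, not the node's implementation: the function names `iqr_bounds` and `flag_outliers` are hypothetical, and the quartiles are computed with the standard library's linear-interpolation ("inclusive") method, which is an assumption and may differ from the quartile estimation the node uses.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) = (Q1 - k*IQR, Q3 + k*IQR) for a list of numbers."""
    # Assumption: linear-interpolation quartiles; other estimators give other bounds.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return one boolean per value: True if it lies outside the range R."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


data = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
print(iqr_bounds(data))     # Q1 = 4, Q3 = 8, IQR = 4, so R = [-2, 14]
print(flag_outliers(data))  # only the last value, 50.0, is flagged
```

A larger k widens R and flags fewer observations; k = 0 flags everything outside [Q_{1}, Q_{3}].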
Providing grouping information allows outliers to be detected only within their respective groups.
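To illustrate what group-wise detection means, the following sketch computes a separate range R per group and tests each observation only against the range of its own group. The function name `flag_outliers_by_group`, the `(group, value)` row layout, and the quartile method are assumptions made for this example.

```python
from collections import defaultdict
from statistics import quantiles


def flag_outliers_by_group(rows, k=1.5):
    """rows is a list of (group, value) pairs; returns one boolean per row."""
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)

    # One range R per group. Each group needs at least two values here.
    bounds = {}
    for group, values in by_group.items():
        q1, _median, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        bounds[group] = (q1 - k * iqr, q3 + k * iqr)

    # A row is an outlier only relative to the range of its own group.
    return [not (bounds[g][0] <= v <= bounds[g][1]) for g, v in rows]


rows = [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("a", 100),
        ("b", 100), ("b", 101), ("b", 102), ("b", 103), ("b", 104)]
flags = flag_outliers_by_group(rows)
print(flags)
```

In this example the value 100 is flagged in group "a", whose range is [-1, 7], but not in group "b", whose range is [98, 106].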
If an observation is flagged as an outlier, it can either be replaced by some other value, or the corresponding row can be removed or retained.
Missing values contained in the data are ignored, i.e., they are neither used for the outlier computation nor flagged as outliers.
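The treatment step and the handling of missing values can be sketched as follows. This is a minimal illustration under several assumptions: missing values are represented as `None`, the function name `treat_outliers` and its `strategy` values are hypothetical, and clamping to the nearest bound of R stands in for "some other value" as one possible replacement.

```python
from statistics import quantiles


def treat_outliers(values, k=1.5, strategy="clamp"):
    """Treat outliers in a list that may contain None for missing values."""
    if strategy not in ("clamp", "remove"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Missing values take no part in the quartile computation.
    present = [v for v in values if v is not None]
    q1, _median, q3 = quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr

    result = []
    for v in values:
        if v is None or lower <= v <= upper:
            result.append(v)  # missing values and inliers pass through unchanged
        elif strategy == "clamp":
            result.append(min(max(v, lower), upper))  # replace by the nearest bound
        # strategy == "remove": the outlier is dropped
    return result


values = [2.0, 3.0, None, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
print(treat_outliers(values, strategy="clamp"))
print(treat_outliers(values, strategy="remove"))
```

With these values R = [-2, 14], so clamping turns 50.0 into 14.0, removal drops it, and the `None` entry is kept in both cases.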
To use this node in KNIME, install KNIME Statistic Nodes from the following update site: