Statistics

This node calculates statistical measures such as minimum, maximum, mean, standard deviation, variance, median, overall sum, number of missing values and row count across all numeric columns, and counts all nominal values together with their occurrences. The dialog offers options for enabling the median computation and for configuring the nominal value calculations.

Options

Settings

Compute median values
Select this option to compute the median for all numeric columns. Note that this computation can be expensive, since each column must be sorted independently to find the value that divides the distribution into two halves with the same number of values.
Column filter
Selects the columns for which all possible values are counted.
Nominal values
Sets how many of the most frequent and least frequent categorical values are listed per column (displayed in the node view).
Nominal values in output
Adjusts the maximum number of possible values per column in the nominal output table.
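
The nominal value settings above can be illustrated outside the node with a minimal Python sketch. It is not the node's implementation: the function name `nominal_summary`, the parameters `top_n` and `max_output`, the use of `None` for a missing cell, and the choice to keep the most frequent values when the output is capped are all assumptions made for illustration.

```python
from collections import Counter


def nominal_summary(values, top_n=3, max_output=5):
    """Count the values of one nominal column (hypothetical helper).

    top_n      -- how many most/least frequent values to list (assumed name)
    max_output -- cap on the values kept for the output table (assumed name)
    """
    # None stands in for a missing cell and is not counted as a value.
    counts = Counter(v for v in values if v is not None)
    ranked = counts.most_common()  # most frequent first
    return {
        "top": ranked[:top_n],
        "bottom": ranked[::-1][:top_n],  # least frequent first
        # Assumption: when capped, the most frequent values are kept.
        "output": ranked[:max_output],
    }


column = ["a", "b", "a", "c", "a", "b", None, "d"]
summary = nominal_summary(column, top_n=2, max_output=3)
print(summary["top"])
print(summary["bottom"])
print(summary["output"])
```

Keeping the two limits as separate parameters mirrors the dialog, where the number of values shown in the view and the number of values written to the output table are configured independently.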

Histogram

Histogram format
The format of the histogram cells: SVG or PNG.
Width
The width of the histogram.
Height
The height of the histogram.
Show min/max values
Whether to show the numeric min/max values on the histograms.

Input Ports

Table from which to compute statistics.

Output Ports

Table with numeric values.
Table with all nominal value histograms.
Table with all nominal values and their counts.
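
For orientation, the per-column measures reported for numeric columns can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: `numeric_summary` is a hypothetical helper, `None` stands in for a missing cell, and the sample (n - 1) form of variance and standard deviation is assumed, since the description does not say which form the node uses. The median is obtained by sorting the column, which is why the median option is described as potentially expensive.

```python
import math
import statistics


def numeric_summary(values):
    """Summarize one numeric column (hypothetical helper, not the node's code)."""
    # None stands in for a missing cell; sorting is needed only for the median.
    present = sorted(v for v in values if v is not None)
    n = len(present)
    summary = {"row_count": len(values), "missing": len(values) - n}
    if n == 0:
        return summary  # nothing else can be computed for an empty column

    mid = n // 2
    # Odd count: the middle element; even count: mean of the two middle elements.
    median = present[mid] if n % 2 else (present[mid - 1] + present[mid]) / 2

    summary.update({
        "min": present[0],
        "max": present[-1],
        "mean": statistics.fmean(present),
        # Assumption: sample variance (n - 1 denominator); undefined for n < 2.
        "variance": statistics.variance(present) if n > 1 else math.nan,
        "std_dev": statistics.stdev(present) if n > 1 else math.nan,
        "median": median,
        "sum": math.fsum(present),
    })
    return summary


print(numeric_summary([4.0, None, 1.0, 3.0, 2.0]))
```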

Views

Statistics View
Displays all statistical measures (for all numeric columns), the nominal values (for all selected categorical columns), and the most frequent and least frequent values of the categorical columns (top/bottom).
