Binning Apply

The incoming data is grouped according to the node's grouping settings. Within each group, and for each column of the incoming binning model, the data points falling into each bin are counted. Data points with values below the lowest bin or above the highest bin are counted as well. Missing values are not counted.
The first output table contains these counts and the corresponding percentages within the group. The count of the lowest interval includes (!) the data points below it, and the count of the highest interval includes the data points above it. The counts therefore sum to the number of data points in the group, and the percentages sum to 100%.
The second output table contains only the counts of data points with values below the lowest or above the highest interval. The percentage is based on the data point count.
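
The counting described above can be sketched in Python. This is only an illustration, not the node's implementation: the function names, the (group, value) row layout, and the half-open interval convention (each bin includes its lower edge, the last bin also includes its upper edge) are assumptions made for the example.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def count_bins(rows, edges):
    """Count values per group and bin.

    rows  -- iterable of (group, value) pairs (hypothetical layout)
    edges -- sorted bin boundaries, at least two values
    Returns (counts, outliers): counts[group] is a list with one entry per
    bin, outliers[group] is [below_lowest, above_highest].
    """
    n_bins = len(edges) - 1
    counts = defaultdict(lambda: [0] * n_bins)   # first output table
    outliers = defaultdict(lambda: [0, 0])       # second output table
    for group, value in rows:
        # Missing values are skipped entirely.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        if value < edges[0]:
            outliers[group][0] += 1
            idx = 0                              # folded into the lowest bin
        elif value > edges[-1]:
            outliers[group][1] += 1
            idx = n_bins - 1                     # folded into the highest bin
        else:
            # Assumed convention: [lower, upper), last bin closed on the right.
            idx = min(bisect_right(edges, value) - 1, n_bins - 1)
        counts[group][idx] += 1
    return counts, outliers


def percentages(bin_counts):
    """Percentage of each bin relative to the group's counted data points."""
    total = sum(bin_counts)
    return [100.0 * c / total if total else 0.0 for c in bin_counts]
```

Because values outside the bin range are added to the edge bins, the per-group counts in the first result always add up to the number of non-missing values, which is why the percentages sum to 100%.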

Options

General settings

Group columns

The included columns are used to group the data.

Exclude incomplete binning models

Models with fewer bins than expected are excluded. The expected number of bins is part of the binning model.
default: checked

Ignore missing columns

If checked, only a warning is shown when the incoming data table lacks columns that are contained in the binning model. If unchecked, the node cannot be executed if any column is missing.
default: checked

Input is already sorted by group column(s)

If checked, the data will not be sorted during execution (faster). The node will fail if the pre-sorting is not correct.
default: unchecked
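
One way to process pre-sorted input without sorting it again is to walk the rows group by group and fail as soon as a group reappears. The sketch below illustrates that idea. How the node itself detects incorrect pre-sorting is not documented here, and the function name and row layout are hypothetical.

```python
from itertools import groupby


def iter_presorted_groups(rows, key):
    """Yield (group, members) from rows that are expected to be grouped already.

    Raises ValueError when a group shows up a second time, which means the
    rows of that group were not contiguous in the input.
    """
    seen = set()
    for group, members in groupby(rows, key=key):
        if group in seen:
            raise ValueError(f"input is not sorted by group: {group!r} reappears")
        seen.add(group)
        yield group, list(members)
```

Note that this check only verifies that the rows of each group are contiguous, which is enough for group-wise counting. It does not verify any particular sort order.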

Sampling options

Enable sampling

If checked, each group is randomly reduced to a fixed, configurable number of rows. If a group contains fewer rows, all of them are used.
default: 100

Use random seed

If checked, the random selection of rows for the sampling will be based on a fixed seed to make it reproducible.
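
The effect of the two sampling options can be sketched as follows. This is a minimal illustration that uses Python's standard random module, not the node's actual sampling code. The function name and parameters are hypothetical.

```python
import random


def sample_group(rows, max_rows, seed=None):
    """Randomly reduce one group to at most max_rows rows.

    With a fixed seed the same rows are selected on every run;
    with seed=None the selection differs between runs.
    """
    rows = list(rows)
    if len(rows) <= max_rows:
        return rows                      # small groups are used completely
    rng = random.Random(seed)            # private generator, no global state
    return rng.sample(rows, max_rows)    # sampling without replacement
```

For example, calling sample_group(group_rows, 100, seed=42) twice on the same group returns the same 100 rows both times, which is what makes the result reproducible.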

Input Ports

Data to apply the binning model to
Binning model

Output Ports

Row counts per group, parameter and interval
(including outlier count within the lowest/highest interval)
Row counts per group, parameter and interval for data points below the lowest or above the highest interval

Views

This node has no views
