Conformal Partitioning

The input table is split row-wise into two partitions: a calibration set and a training set. Unlike the standard Partitioning node, this node adjusts the size of the calibration set downwards to (largest lower value divisible by 100) - 1; for example, a requested size of 250 rows becomes 199. The two partitions are available at the two output ports.
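The size adjustment can be illustrated with a minimal Python sketch. The function name `adjusted_calibration_size` is hypothetical and not part of the node. Two points are assumptions, because the description does not cover them: a requested size that is already a multiple of 100 is treated as its own "lower value", and a request below 100 rows yields an empty calibration set.

```python
def adjusted_calibration_size(requested_rows: int) -> int:
    """Hypothetical helper: round down to a multiple of 100, then subtract 1."""
    if requested_rows < 100:
        # Assumption: the behaviour below 100 rows is not documented,
        # so this sketch simply returns an empty calibration set.
        return 0
    # Assumption: an exact multiple of 100 counts as its own lower value.
    return (requested_rows // 100) * 100 - 1


for requested in (250, 999):
    print(requested, "->", adjusted_calibration_size(requested))
```

Rows removed from the calibration set by this adjustment would presumably end up in the training partition, since that port receives the remaining rows.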

Options

Absolute
Specify the absolute number of rows in the calibration set (before adjustment). If the input table has fewer rows than specified here, all rows are placed in the first table and the second table contains no rows.
Relative
The percentage of rows in the input table that go into the calibration set (before adjustment). It must be between 0 and 100, inclusive.
Take from top
This mode puts the top-most rows into the calibration set and the remainder in the training set.
Linear sampling
This mode always includes the first and the last row and selects the remaining rows linearly over the whole table (e.g. every third row).
Draw randomly
Random sampling of all rows. You may optionally specify a fixed seed (see below).
Stratified sampling
Check this button if you want stratified sampling, i.e. the distribution of values in the selected column is (approximately) retained in the output tables. You may optionally specify a fixed seed (see below).
Use random seed
If either random or stratified sampling is selected, you may enter a fixed seed here in order to get reproducible results upon re-execution. If you do not specify a seed, a new random seed is taken for each execution.
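The row-selection modes above can be sketched in Python as follows. This is an illustration under stated assumptions, not the node's implementation: all function names are hypothetical, rows are represented by 0-based indices, the exact spacing used by linear sampling is assumed to be evenly rounded positions, and stratified sampling is left out.

```python
import random
from typing import List, Optional, Tuple


def take_from_top(n_rows: int, k: int) -> List[int]:
    """Indices of the top-most k rows (all rows if k exceeds the table)."""
    return list(range(min(k, n_rows)))


def linear_sample(n_rows: int, k: int) -> List[int]:
    """k indices spread evenly over the table, including first and last row."""
    if k <= 0 or n_rows == 0:
        return []
    if k >= n_rows:
        return list(range(n_rows))
    if k == 1:
        return [0]
    # Assumption: evenly spaced positions rounded to the nearest row.
    step = (n_rows - 1) / (k - 1)
    return [round(i * step) for i in range(k)]


def draw_randomly(n_rows: int, k: int, seed: Optional[int] = None) -> List[int]:
    """k distinct random indices; a fixed seed makes the draw reproducible."""
    rng = random.Random(seed)  # seed=None takes a fresh seed on each call
    return sorted(rng.sample(range(n_rows), min(k, n_rows)))


def split(n_rows: int, calibration: List[int]) -> Tuple[List[int], List[int]]:
    """Return (calibration indices, remaining indices for training)."""
    chosen = set(calibration)
    training = [i for i in range(n_rows) if i not in chosen]
    return calibration, training


calibration, training = split(10, linear_sample(10, 4))
print(calibration)  # [0, 3, 6, 9]
print(training)     # [1, 2, 4, 5, 7, 8]
```

With 10 rows and 4 requested, the linear mode picks every third row, matching the example in the option description. Calling `draw_randomly` twice with the same seed returns the same indices, which mirrors the purpose of the fixed-seed option.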

Input Ports

Table to partition.

Output Ports

Calibration partition (as defined in the dialog, with its size adjusted downwards to (largest lower value divisible by 100) - 1).
Training partition (remaining rows).

Views

This node has no views.
