DB Row Sampling

This node extracts a sample from the input data. The dialog enables you to specify the sample size and the sampling strategy.

Options

Absolute
Specify the absolute number of rows in the sample. If the table contains fewer rows than specified here, all rows are used.
Relative
Specify the percentage of rows in the database table to extract. The value must be between 0 and 100, inclusive.
Take from top
This mode selects the topmost rows of the table. Note that the order of the rows depends on the connected database.
Draw randomly
Sample rows in random order if the connected database supports random sampling. Note that this method might be very slow for large database tables.
Stratified sampling
Check this button if you want stratified sampling, i.e. the distribution of values in the selected column is (approximately) retained in the output table.
Random seed
If either random or stratified sampling is selected, you may enter a fixed seed to get reproducible results upon re-execution. This option is disabled if the database does not support a seed per query execution. If no seed is specified, the database might use a new random seed for each execution.
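
The options above can be illustrated with a minimal Python sketch against an in-memory SQLite table. This is not the SQL that the node generates, which this description does not show. The helpers `sample_size` and `stratified_sample`, the table `t`, and the rounding rule are all assumptions made for illustration, and the stratified draw is done client-side only to keep the example short.

```python
import random
import sqlite3
from collections import defaultdict


def sample_size(total_rows, absolute=None, relative=None):
    """Resolve the sample size from an absolute count or a percentage (hypothetical helper)."""
    if absolute is not None:
        # Absolute: capped at the table size, so a small table is returned whole.
        return min(absolute, total_rows)
    if relative is None or not 0 <= relative <= 100:
        raise ValueError("relative must be between 0 and 100, inclusive")
    # Relative: percentage of the row count (the rounding rule is an assumption).
    return round(total_rows * relative / 100)


def stratified_sample(rows, column, n, seed=None):
    """Draw about n rows, keeping each value's share in `column` roughly intact."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    sample = []
    for members in groups.values():
        quota = round(n * len(members) / len(rows))  # proportional allocation
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Demo table: 80 rows labelled "a" and 20 rows labelled "b".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(i, "a" if i < 80 else "b") for i in range(100)],
)
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
n = sample_size(total, relative=10)

# Take from top: without ORDER BY, which rows come first is up to the database.
top_rows = conn.execute("SELECT * FROM t LIMIT ?", (n,)).fetchall()

# Draw randomly: SQLite's RANDOM() takes no seed, so this draw is not repeatable.
random_rows = conn.execute(
    "SELECT * FROM t ORDER BY RANDOM() LIMIT ?", (n,)
).fetchall()

# Stratified: done client-side here purely for illustration.
all_rows = conn.execute("SELECT * FROM t").fetchall()
strat_rows = stratified_sample(all_rows, column=1, n=n, seed=42)
```

SQLite is a convenient example of a database without a seed per query execution: its `RANDOM()` function accepts no seed, which is the situation in which the Random seed option would be disabled.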

Input Ports

DB Data to which sampling is applied.

Output Ports

DB Data with sampled rows.

Views

This node has no views.
