DB Partitioning

This node splits the rows of a DB Data table into two partitions. The dialog lets you specify the number of rows in the first partition and the splitting strategy.

The created partitions might overlap, depending on the database and the selected sampling option. KNIME does not store the data of the first partition; instead, it re-executes the query that defines the first partition as part of the query that retrieves the second partition. If that query returns a different result on each execution, the two partitions might overlap. This is most likely the case for random sampling without a fixed seed.
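The overlap described above can be reproduced outside KNIME. The following is a minimal sketch using Python's built-in sqlite3 module; the table `t`, the helper `partitions`, and the `NOT IN` formulation of the second partition are assumptions made for illustration and are not the SQL that KNIME actually generates.

```python
import sqlite3

# In-memory example table with ten rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO t (id, v) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 11)],
)

def partitions(first_query):
    """Run the first-partition query, then a second query that embeds it again.

    Nothing is stored between the two executions, so the embedded copy of
    first_query is evaluated a second time by the database.
    """
    first = [r[0] for r in conn.execute(first_query)]
    second_query = (
        "SELECT id FROM t WHERE id NOT IN "
        f"(SELECT id FROM ({first_query}))"
    )
    second = [r[0] for r in conn.execute(second_query)]
    return first, second

# Deterministic first partition: both executions agree, so no overlap.
top_first, top_second = partitions("SELECT id FROM t ORDER BY id LIMIT 3")

# Random first partition without a seed: the second execution may pick
# different rows, so some ids can appear in both partitions.
rand_first, rand_second = partitions(
    "SELECT id FROM t ORDER BY RANDOM() LIMIT 3"
)
overlap = set(rand_first) & set(rand_second)
print("overlapping ids:", sorted(overlap))
```

With the deterministic query the two partitions are disjoint. With `ORDER BY RANDOM()` the set of overlapping ids varies from run to run and is often non-empty.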

Options

Absolute
Specify the absolute number of rows in the sample. If there are fewer rows than specified, all rows are used.
Relative
The percentage of rows in the DB Data table to extract. Must be between 0 and 100, inclusive.
Take from top
This mode selects the topmost rows of the table. Note that the order of the rows depends on the connected database.
Draw randomly
Sample rows in random order if the connected database supports random sampling.
Note that this method might be very slow for large database tables.
Stratified sampling
Check this button if you want stratified sampling, i.e. the distribution of values in the selected column is (approximately) retained in the output table.
Random seed
Check this option if you want to provide a seed number for random sampling.
The two partitions will most likely overlap if you do not provide a fixed seed for random sampling.
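How these options interact can be illustrated with a small in-memory sketch. It is not the node's implementation (the node delegates sampling to the database); the function names `sample_size` and `stratified_split`, the rounding rule for relative sizes, and the example rows are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def sample_size(total, absolute=None, relative=None):
    """Number of rows in the first partition for the Absolute or Relative option."""
    if absolute is not None:
        # Fewer rows than requested: use all of them.
        return min(absolute, total)
    if not 0 <= relative <= 100:
        raise ValueError("relative must be between 0 and 100, inclusive")
    # Rounding rule is an assumption; a database may round differently.
    return round(total * relative / 100)

def stratified_split(rows, key, relative, seed=None):
    """Split rows so each value of key(row) keeps roughly the same share."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    first, second = [], []
    for members in groups.values():
        k = sample_size(len(members), relative=relative)
        picked = set(rng.sample(range(len(members)), k))
        for i, row in enumerate(members):
            (first if i in picked else second).append(row)
    return first, second

# Hypothetical table: eight rows of class "a" and four rows of class "b".
rows = [("a", i) for i in range(8)] + [("b", i) for i in range(4)]
first, second = stratified_split(rows, key=lambda r: r[0], relative=50, seed=42)
```

Here the first partition holds four of the eight "a" rows and two of the four "b" rows, so the 2:1 class ratio is preserved. Because the sketch materializes the first partition once, the two partitions never overlap, and calling `stratified_split` again with the same seed returns the same split.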

Input Ports

DB Data table to which database sampling is applied.

Output Ports

First DB Data partition with sampled rows.
Second DB Data partition with sampled rows.

Views

This node has no views
