How to Sample Data
This workflow demonstrates several ways to create a smaller, more manageable version of a large dataset. Each branch uses a different sampling technique to select a subset of rows. Random sampling picks rows at random. Linear sampling takes a fixed number of rows from the top of the table. Stratified sampling keeps each group (such as a class or category) represented in proportion to its share of the full dataset. Equal size sampling draws exactly the same number of rows from each group. These approaches let you test models or analyze data efficiently without processing the entire dataset.
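For readers who think in code, the four techniques can be sketched outside KNIME. The following is a minimal pandas analogue, not the workflow's own implementation: the toy table, the column names `value` and `label`, the 10% fraction, and the fixed seed are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy dataset: 100 rows with an imbalanced "label" column (80 "a", 20 "b")
df = pd.DataFrame({
    "value": range(100),
    "label": ["a"] * 80 + ["b"] * 20,
})

# Random sampling: draw 10% of the rows at random (seed fixed for repeatability)
random_sample = df.sample(frac=0.1, random_state=0)

# Linear sampling as described above: take the first 10 rows
first_rows = df.head(10)

# Stratified sampling: draw 10% from each label group, so proportions are preserved
stratified = df.groupby("label", group_keys=False).sample(frac=0.1, random_state=0)

# Equal size sampling: draw the same number of rows from each label group,
# limited by the size of the smallest group
n_min = df["label"].value_counts().min()
equal_size = df.groupby("label", group_keys=False).sample(n=n_min, random_state=0)
```

With this toy table, the stratified sample keeps the 80/20 ratio between the two labels, while the equal size sample contains the same number of rows for each label regardless of the original imbalance.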
To use this workflow, download it from the URL below and open it in KNIME:
Download Workflow