This workflow is based on the adult.csv data set. Try it out to:
1. Remove duplicates
- keep the first or last appearance of the duplicates
- keep the row in each group of duplicates that has the maximum or minimum value of a specific feature
2. Flag duplicates
- add a column that flags rows as unique, duplicate or chosen
- add a column that displays the RowID of the (representative) chosen row for each duplicate
- add both of the columns described above
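For readers who want to see the logic outside KNIME, the options above can be sketched in plain Python. This is only an illustration, not the implementation used by the workflow's nodes: the function names, the output column names (`duplicate-status`, `chosen-row-id`), the `Row<N>` identifiers, and the sample rows are all assumptions made for the example, and leaving the RowID column empty for unique and chosen rows is likewise an assumed convention.

```python
from collections import defaultdict

def group_duplicates(rows, key_columns):
    """Map each combination of key values to the indices of the rows sharing it."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[tuple(row[c] for c in key_columns)].append(index)
    return groups

def choose_representative(rows, indices, strategy="first", feature=None):
    """Pick the index of the row to keep within one group of duplicates."""
    if strategy == "first":
        return indices[0]
    if strategy == "last":
        return indices[-1]
    if strategy == "max":
        return max(indices, key=lambda i: rows[i][feature])
    if strategy == "min":
        return min(indices, key=lambda i: rows[i][feature])
    raise ValueError(f"unknown strategy: {strategy}")

def remove_duplicates(rows, key_columns, strategy="first", feature=None):
    """Return one row per group of duplicates, preserving input order."""
    groups = group_duplicates(rows, key_columns)
    kept = sorted(
        choose_representative(rows, indices, strategy, feature)
        for indices in groups.values()
    )
    return [rows[i] for i in kept]

def flag_duplicates(rows, key_columns, strategy="first", feature=None):
    """Return copies of all rows with a status column and a chosen-row column."""
    flagged = [dict(row) for row in rows]
    for indices in group_duplicates(rows, key_columns).values():
        chosen = choose_representative(rows, indices, strategy, feature)
        for i in indices:
            if len(indices) == 1:
                status = "unique"
            elif i == chosen:
                status = "chosen"
            else:
                status = "duplicate"
            flagged[i]["duplicate-status"] = status
            # Assumption: only duplicates point at their representative row.
            flagged[i]["chosen-row-id"] = f"Row{chosen}" if status == "duplicate" else None
    return flagged

# Made-up sample rows using column names in the style of adult.csv.
rows = [
    {"age": 39, "education": "Bachelors", "hours-per-week": 40},  # Row0
    {"age": 50, "education": "HS-grad", "hours-per-week": 13},    # Row1
    {"age": 39, "education": "Bachelors", "hours-per-week": 45},  # Row2
    {"age": 39, "education": "Bachelors", "hours-per-week": 20},  # Row3
]
keys = ["age", "education"]

for row in flag_duplicates(rows, keys, strategy="first"):
    print(row)
```

Calling `remove_duplicates(rows, keys, strategy="max", feature="hours-per-week")` would keep Row1 and Row2 here, since Row2 has the highest `hours-per-week` among the three rows that share the same `age` and `education`.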
To use this workflow, download it from the URL below and open it in KNIME:
Download Workflow