Exercise3

Chapter 6 / Exercise 3

In this exercise, we remove duplicate rows from a file.
The easiest way is to use a GroupBy node: group by contract ID and keep only the first item in each group.
A more complex way is to use a group loop.
Here we choose the most complex way of all, a Table Row To Variable Loop Start node, just to show how the Table Row To Variable Loop works.
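For readers who think in code, the "easiest way" (group by contract ID, keep the first item per group) can be sketched in plain Python. This is only an analogy to the GroupBy node, not KNIME code; the function name `drop_duplicate_contracts` and the sample rows are made up for illustration, and only the column names come from the workflow.

```python
def drop_duplicate_contracts(rows, key="contract nr"):
    """Keep the first row seen for each key value, preserving input order."""
    first_rows = {}
    for row in rows:
        # setdefault stores the row only if this key has not been seen yet
        first_rows.setdefault(row[key], row)
    return list(first_rows.values())


# Hypothetical sales rows (values invented for illustration)
sales = [
    {"contract nr": "A1", "load_date": "2020-01-05", "amount": 100.0},
    {"contract nr": "A1", "load_date": "2020-03-01", "amount": 120.0},
    {"contract nr": "B2", "load_date": "2020-02-10", "amount": 80.0},
    {"contract nr": "C3", "load_date": "2020-01-20", "amount": None},
    {"contract nr": "C3", "load_date": "2020-01-02", "amount": 55.0},
]

deduplicated = drop_duplicate_contracts(sales)
for row in deduplicated:
    print(row["contract nr"], row["load_date"])
```

Which row counts as "first" depends entirely on the input order, so sort the data beforehand if a specific row (for example by load date) should survive.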

Workflow: Chapter6/Exercise3

Note. Since KNIME is designed around data tables, loops are rarely needed. Before using a loop, make sure that no dedicated node already exists for what you have in mind!

Nodes and their annotations in the workflow:
GroupBy: list of "contract nr"
Row Filter: filter by "contract nr"
Row Filter: only 1st row, most recent load_date
Sorter: to remove duplicates, sort ascending on load_date
RowID: contract nr are not unique, RowID on contract nr fails
RowID: now check again uniqueness on "contract nr"
Column Filter: keep only "contract nr"
GroupBy: remove duplicates by grouping on contract nr and using First for aggregation
String to Date&Time: Node 56
CSV Reader: Wrong sales file with duplicated contract number. One row is missing the "amount" and "quantity" values.
Table Row To Variable Loop Start: start loop on "contract nr", one by one
Loop End: Node 59
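The loop-based variant described by the annotations above (build the list of contract numbers, iterate over them one by one, filter, sort on load_date, keep the first row, collect the results) might look roughly like this in plain Python. It is a sketch under assumptions: the exact order of nodes inside the loop is inferred from the annotations, and the function name `dedupe_by_loop`, the `ascending` flag, and the sample rows are hypothetical.

```python
from datetime import date


def dedupe_by_loop(rows, key="contract nr", date_col="load_date", ascending=True):
    """Keep one row per key by looping over the distinct key values."""
    # List of distinct contract numbers (role of GroupBy + Column Filter)
    contract_ids = list(dict.fromkeys(row[key] for row in rows))
    kept = []
    # One iteration per contract number (role of the loop start node)
    for contract_id in contract_ids:
        # Filter the table down to the current contract number
        group = [row for row in rows if row[key] == contract_id]
        # Sort on the load date; the annotations mention an ascending sort.
        # Assumption: flip `ascending` if the most recent row should come first.
        group.sort(key=lambda row: row[date_col], reverse=not ascending)
        # Keep only the first row of the sorted group
        kept.append(group[0])
    # The collected rows play the role of the Loop End output
    return kept


# Hypothetical sales rows with parsed dates (values invented for illustration)
sales = [
    {"contract nr": "A1", "load_date": date(2020, 1, 5), "amount": 100.0},
    {"contract nr": "A1", "load_date": date(2020, 3, 1), "amount": 120.0},
    {"contract nr": "B2", "load_date": date(2020, 2, 10), "amount": 80.0},
    {"contract nr": "C3", "load_date": date(2020, 1, 20), "amount": None},
    {"contract nr": "C3", "load_date": date(2020, 1, 2), "amount": 55.0},
]

result = dedupe_by_loop(sales)
for row in result:
    print(row["contract nr"], row["load_date"])
```

The loop visits each contract number separately and rescans the table every time, which illustrates why the note recommends checking for a dedicated node before reaching for a loop.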
