
Recursively Remove Duplicates KNIME 4.7.8

Uses a recursive loop to remove rows that match certain rules.

On each recursive iteration, Rule 1 is tested.

If a pair of rows matches the rule, both rows are removed.
If no rows match Rule 1, then Rule 2 is tested.
Again, if a pair of rows matches the rule, both rows are removed.
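
As a rough illustration of a single iteration, here is a minimal Python sketch. It is not the workflow itself, which is built from KNIME nodes. The rules are taken from the workflow annotations (Rule 1 joins an R row to a D row on the same code and month, Rule 2 joins on the same code and ignores the month). The field names "type", "code" and "month", and the choice of which matching pair counts as the "first", are assumptions made for the example.

```python
# Hypothetical row layout: each row is a dict with "type" ("D" or "R"),
# "code" and "month" (YYYY-MM). These names are assumptions for illustration.

def find_pair(rows):
    """Return the indices of the first (D, R) pair matching Rule 1,
    else the first pair matching Rule 2, else None."""
    d_idx = [i for i, row in enumerate(rows) if row["type"] == "D"]
    r_idx = [i for i, row in enumerate(rows) if row["type"] == "R"]

    # Rule 1: same code and same month.
    for d in d_idx:
        for r in r_idx:
            if (rows[d]["code"] == rows[r]["code"]
                    and rows[d]["month"] == rows[r]["month"]):
                return d, r

    # Rule 2: same code, month ignored. Only reached if Rule 1 matched nothing.
    for d in d_idx:
        for r in r_idx:
            if rows[d]["code"] == rows[r]["code"]:
                return d, r

    return None


def apply_rules_once(rows):
    """Run one iteration: remove the matched pair, or return the rows unchanged."""
    pair = find_pair(rows)
    if pair is None:
        return list(rows)
    return [row for i, row in enumerate(rows) if i not in pair]
```

Only one pair is removed per call, which mirrors the description above: Rule 2 is only consulted when Rule 1 finds nothing in that iteration.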

If only 1 row remains, the recursive loop ends.
If no rows were removed in the iteration, the recursive loop also ends.

This is then applied by a group loop to each group of rows.
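
The overall control flow (a group loop around a recursive loop with the two stopping conditions) can be sketched in Python as follows. This is an illustration under assumptions, not an export of the workflow: the field names "id", "type", "code", "month" and "date" are hypothetical, dates are assumed to be ISO strings so that they sort chronologically, and the sketch also stops when no rows are left at all.

```python
from itertools import groupby
from operator import itemgetter


def first_match(rows, keys):
    """Indices of the first D/R pair whose values agree on all `keys`, or an empty set."""
    for i, d in enumerate(rows):
        if d["type"] != "D":
            continue
        for j, r in enumerate(rows):
            if r["type"] == "R" and all(d[k] == r[k] for k in keys):
                return {i, j}
    return set()


def reduce_group(rows):
    """Repeat the rules on one group until nothing more can be removed."""
    rows = sorted(rows, key=itemgetter("date"))  # ascending date, earliest first
    while len(rows) > 1:  # stop when 1 row (or nothing) remains
        # Rule 1 (code and month) first; Rule 2 (code only) if Rule 1 found nothing.
        drop = first_match(rows, ("code", "month")) or first_match(rows, ("code",))
        if not drop:  # nothing removed in this iteration, so stop
            break
        rows = [row for k, row in enumerate(rows) if k not in drop]
    return rows


def remove_duplicates(table):
    """Group loop: process each identifier's rows independently and collect the results."""
    result = []
    ordered = sorted(table, key=itemgetter("id"))  # groupby needs sorted input
    for _, group in groupby(ordered, key=itemgetter("id")):
        result.extend(reduce_group(list(group)))
    return result
```

A plain `while` loop stands in for the Recursive Loop Start/End pair here: each pass receives the rows left over from the previous pass, which is what the recursion port does in the workflow.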

See Forum post link for more details.

@takbb Brian Bates 25 Nov 2024 KNIME 5.3.3 version
(ported to KNIME 4.7.8, 28 Nov 2024)

Workflow diagram (annotations recovered from the workflow image)

Stages: PREPARE DATA, ITERATE IDENTIFIERS, APPLY RULES, REMOVE MATCHING ROW, RETURN ROWS OR TRY AGAIN (NEW ITERATION), ITERATE, THE END!

Legend: Row to be removed by application of Rule 1. Row to be removed by application of Rule 2. Nothing to remove.

Rule 1: Join R to D on same code and month? (Row Splitter: top port D, bottom port R.)
Rule 2: Join R to D on same code? Ignore month.
In both cases the first row of the join result is taken, and the two rows that matched the rule are removed.

End of iteration: If at the end of this iteration there is either 1 row left, or the same number of rows as at the start of the iteration (i.e. nothing removed), the rows go to the upper port (collection). Otherwise the rows all go to the lower port (recursion) for processing on the next iteration.

Node notes:
Dates are converted to YYYY-MM.
Group by IDENTIFIER and process all rows for the group.
Recursively iterate until no further rows can be removed from the group by the rules.
Sort by ascending transaction date (i.e. earliest first).
Create a row key column based on RowID.
Get the initial loop count.
Create a dummy entry so the later Joiner doesn't break when no rows are joined.
Nothing matched Rule 1, so apply Rule 2.
Set the RowID to "Row0" to create a standard column name (once the table is transposed) that can be used in the Joiner.
Nothing matched Rule 2, so return a dummy that won't cause any rows to be removed.
RowIDs/keys to be removed from the set.
Stamp the original RowID back on the winning rows.
If only 1 row remains, send it to the collector. Otherwise send them all back again.

Node types used: Excel Reader, String to Date&Time, Date&Time to String, Group Loop Start, Recursive Loop Start, Sorter, Column Filter, RowID, Transpose, Rule-based Row Splitter, Row Splitter, Joiner, Row Filter, Extract Table Dimension, Empty Table Switch, Table Creator, Add Empty Rows, CASE Switch End, Recursive Loop End, Loop End.
