Icon

Search reference in different columns-takbb 3

Adjust the Values in RowFilter to include/excludemore based on"similarity" Add or remove any manual rulesthat you might consider. As anexample Included a rule that saidif City is present in both, but isnot the same, then filter out therow (even though the rest of theaddress may be the same) More problematic might be how to deal with "street number"difference, where the only difference in the strings is the buildingnumber!You may need to apply additional logic, but at least you have areduced data set on which to apply such rules Table 1Table 2Create Textfrom the fieldsto be searchedLet's count howmany "key words"were foundLet's keep onlythe one withthe highestkey word countMatch column valueswe don't care about the "columnName" theycame from as the data is notconsistentCreate our ownrow id seq numberfor future referencingCreate our ownrow id seq numberfor future referencingTurn into "key value pairs"Just to IDsThis is list of "potential" matches, no"definite" matchesPull in original datafrom Table1Pull in original datafrom Table2Bring "potentially matching" data togetherfrom Table1 and Table2Tab1IDTab2IDOrigTab2IDOrigTab1IDUnique concatenated string of alladdress componentsUnique concatenated string of alladdress componentsTidy upSort into order of table 1Turn into "key value pairs"Include rows that we didn't establish a match forGet unique set from Table1(ignore duplicates caused by potential match to multiple Table2 rows)Count matches for each Tab1-Tab2combinationOnly include wheretables match on more than 1address element?Node 52Let's replace themissing values withempty string Table Creator Table Creator String Manipulation Column Expressions DuplicateRow Filter Joiner Counter Generation Counter Generation Unpivoting Column Filter Joiner Joiner Joiner Column Rename Column Rename Column Rename Column Rename Column Aggregator Column Aggregator Column Filter Sorter Unpivoting Joiner DuplicateRow Filter GroupBy Row Filter Column Filter Missing Value Adjust the Values in RowFilter to include/excludemore based on"similarity" Add or remove any manual rulesthat you might consider. As anexample Included a rule that saidif City is present in both, but isnot the same, then filter out therow (even though the rest of theaddress may be the same) More problematic might be how to deal with "street number"difference, where the only difference in the strings is the buildingnumber!You may need to apply additional logic, but at least you have areduced data set on which to apply such rules Table 1Table 2Create Textfrom the fieldsto be searchedLet's count howmany "key words"were foundLet's keep onlythe one withthe highestkey word countMatch column valueswe don't care about the "columnName" theycame from as the data is notconsistentCreate our ownrow id seq numberfor future referencingCreate our ownrow id seq numberfor future referencingTurn into "key value pairs"Just to IDsThis is list of "potential" matches, no"definite" matchesPull in original datafrom Table1Pull in original datafrom Table2Bring "potentially matching" data togetherfrom Table1 and Table2Tab1IDTab2IDOrigTab2IDOrigTab1IDUnique concatenated string of alladdress componentsUnique concatenated string of alladdress componentsTidy upSort into order of table 1Turn into "key value pairs"Include rows that we didn't establish a match forGet unique set from Table1(ignore duplicates caused by potential match to multiple Table2 rows)Count matches for each Tab1-Tab2combinationOnly include wheretables match on more than 1address element?Node 52Let's replace themissing values withempty stringTable Creator Table Creator String Manipulation Column Expressions DuplicateRow Filter Joiner Counter Generation Counter Generation Unpivoting Column Filter Joiner Joiner Joiner Column Rename Column Rename Column Rename Column Rename Column Aggregator Column Aggregator Column Filter Sorter Unpivoting Joiner DuplicateRow Filter GroupBy Row Filter Column Filter Missing Value

Nodes

Extensions

Links