Fuzzy Address Matching

Have you ever been annoyed that you cannot join two datasets on strings because they aren't an exact match? Fuzzy matching to the rescue! This workflow snippet demonstrates how the String Matcher node can be used to perform such fuzzy matching without you needing to know anything about the Levenshtein distance (though it is still worth looking up).

This workflow snippet shows how fuzzy joining can be realized, using partly matching addresses as the example.

Workflow steps:

CSV Reader (x2): read "addresses 1" and "addresses 2".
String Manipulation (x2): preprocess the keys so that only lowercase alphanumeric characters are left. It's critical to adjust the preprocessing to your use case!
String Matcher: do the fuzzy matching.
Row Filter: filter bad matches. This threshold depends on your case. Here "10" makes sense, which roughly means that at most 10 characters in the keys are allowed to be different (out of the 20-30 characters in the key).
Joiner: join in address2.
Joiner: join in address1, filter columns.
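For readers who want to see the idea outside of KNIME, the steps above can be sketched in plain Python. This is only an illustration under stated assumptions: it is not how the String Matcher node is implemented, the function names (`normalize`, `levenshtein`, `fuzzy_join`) and the sample addresses are hypothetical, and the threshold of 10 is taken from the workflow annotation.

```python
import re


def normalize(address: str) -> str:
    """Preprocessing step: keep only lowercase alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", address.lower())


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_join(left, right, max_distance=10):
    """Pair each left address with its closest right address and
    drop pairs whose distance exceeds max_distance (the "bad matches")."""
    right_keys = [(normalize(r), r) for r in right]
    matches = []
    for left_address in left:
        if not right_keys:
            break
        key = normalize(left_address)
        distance, best = min(
            (levenshtein(key, right_key), r) for right_key, r in right_keys
        )
        if distance <= max_distance:
            matches.append((left_address, best, distance))
    return matches


# Hypothetical sample data standing in for the two CSV files.
addresses_1 = ["12 Main Street, Springfield", "45 Oak Ave., Shelbyville"]
addresses_2 = ["45 Oak Avenue Shelbyville", "12 Main St. Springfield"]

for left_address, right_address, distance in fuzzy_join(addresses_1, addresses_2):
    print(f"{left_address!r} <-> {right_address!r} (distance {distance})")
```

As in the workflow, both the normalization and the threshold are the parts to tune: a threshold that is large relative to the key length will let unrelated addresses through, while a stricter one will drop legitimate matches.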
