
Simple Fuzzy Aggregation

<p>This workflow demonstrates the concept of <strong>Fuzzy Aggregation</strong>. It shows how to aggregate numerical data reliably when categorical fields such as city names are inconsistent or contain spelling variations.</p><p>Two input datasets are used:</p><ul><li><p>A <strong>City Sales Dataset with Typos</strong>, containing sales values per city, where the same city appears under several spelling variants and typos (e.g., <em>Munchen, Muenchen, München, Berllin</em>).</p></li><li><p>A <strong>Correct City Names Dataset</strong>, serving as the standardized reference.</p></li></ul><p>The <strong>Approximate String Matcher</strong> node maps each noisy city name to its closest counterpart in the reference dataset. Once the mapping is complete, the <strong>GroupBy</strong> node sums the sales values per standardized city name.</p><p>As a result, every sales value is counted under the correct city, even when the original input contained duplicates, variants, or errors.</p><p>The workflow highlights how exorbyte’s matching technology can be used to produce <strong>accurate and trustworthy aggregations</strong> in real-world scenarios where source data is messy or inconsistent.</p>
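The match-then-group idea can be sketched outside KNIME in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's `difflib.get_close_matches` as a stand-in similarity measure, not exorbyte's matching technology, and the sample rows, the reference list, the `cutoff` value, and the function names `standardize` and `fuzzy_aggregate` are hypothetical rather than taken from the workflow.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical sample data; the workflow's actual datasets are not reproduced here.
CORRECT_CITIES = ["München", "Berlin", "Hamburg"]
SALES = [
    ("Munchen", 100.0),
    ("Muenchen", 50.0),
    ("München", 25.0),
    ("Berllin", 40.0),
    ("Berlin", 60.0),
    ("Hamburg", 10.0),
]


def standardize(name, reference, cutoff=0.6):
    """Return the most similar reference name, or None if none reaches the cutoff.

    difflib's ratio-based similarity is an assumption made for this sketch;
    it is not the algorithm used by the Approximate String Matcher node.
    """
    matches = get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def fuzzy_aggregate(rows, reference, cutoff=0.6):
    """Sum the amounts per standardized name and collect rows with no match."""
    totals = defaultdict(float)
    unmatched = []
    for name, amount in rows:
        city = standardize(name, reference, cutoff)
        if city is None:
            # Keep unmatched rows visible instead of silently dropping them.
            unmatched.append((name, amount))
        else:
            totals[city] += amount
    return dict(totals), unmatched


totals, unmatched = fuzzy_aggregate(SALES, CORRECT_CITIES)
for city, total in sorted(totals.items()):
    print(f"{city}: {total}")
```

Returning unmatched rows separately is a design choice of this sketch: a name that is too far from every reference entry is reported for review rather than forced into the nearest group, which would distort the totals.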

URL: exorbyte GmbH https://www.exorbyte.com/en
URL: Approximate String Matcher Node https://hub.knime.com/exorbyte-team/extensions/com.exorbyte.knime.matchmaker.toolbox.feature/latest/com.exorbyte.knime.matchmaker.toolbox.impl.ApproximateStringMatcherNodeFactory
