Simple Fuzzy Aggregation

This workflow demonstrates fuzzy aggregation: reliably aggregating numerical data when a categorical field, such as a city name, is inconsistent or contains spelling variations.

Two input datasets are used:

- A City Sales Dataset with Typos, containing sales values per city, where city names appear in several variants and misspellings (e.g., Munchen, Muenchen, München, Berllin).
- A Correct City Names Dataset, serving as the standardized reference.

The Approximate String Matcher node maps each noisy city name to its closest correct counterpart. Once the mapping is complete, the GroupBy node aggregates sales per standardized city name. This ensures that all sales values are counted under the correct city, even when the original input contained duplicates, variants, or errors.

The workflow highlights how exorbyte’s matching technology can be used to obtain accurate and trustworthy aggregations in real-world scenarios where source data is messy or inconsistent.
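The match-then-group idea described above can also be sketched outside KNIME. The following is a minimal illustration only: it uses `difflib.get_close_matches` from the Python standard library as a stand-in for the Approximate String Matcher node, which is an assumption and not exorbyte's matching algorithm. The city list, sales figures, function names, and the `cutoff` value are all hypothetical and do not come from the workflow's datasets.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical reference list of correct city names (not the workflow's data).
REFERENCE_CITIES = ["München", "Berlin", "Hamburg"]

# Hypothetical noisy sales records: (city as typed, sales value).
NOISY_SALES = [
    ("Munchen", 100.0),
    ("Muenchen", 50.0),
    ("München", 25.0),
    ("Berllin", 70.0),
    ("Berlin", 30.0),
    ("Hamburg", 10.0),
]


def match_city(name, reference, cutoff=0.6):
    """Return the most similar reference name, or None if none reaches the cutoff."""
    matches = get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def fuzzy_aggregate(records, reference, cutoff=0.6):
    """Sum sales per standardized city; collect records that could not be matched."""
    totals = defaultdict(float)
    unmatched = []
    for city, sales in records:
        standard = match_city(city, reference, cutoff)
        if standard is None:
            # Keep unmatched rows visible instead of silently dropping them.
            unmatched.append((city, sales))
        else:
            totals[standard] += sales
    return dict(totals), unmatched


totals, unmatched = fuzzy_aggregate(NOISY_SALES, REFERENCE_CITIES)
for city, total in totals.items():
    print(f"{city}: {total}")
```

Returning the unmatched records separately is a design choice in this sketch: a name with no sufficiently similar reference entry is reported rather than forced into the nearest group, which keeps the aggregated totals auditable.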

URL: exorbyte GmbH https://www.exorbyte.com/en
URL: Approximate String Matcher Node https://hub.knime.com/exorbyte-team/extensions/com.exorbyte.knime.matchmaker.toolbox.feature/latest/com.exorbyte.knime.matchmaker.toolbox.impl.ApproximateStringMatcherNodeFactory

Nodes

CSV Reader (2 ×)
Approximate String Matcher (1 ×)
GroupBy (1 ×)

Extensions

KNIME Base nodes

Download

To use this workflow, download it from the link below and open it in KNIME:

Download Workflow

Created by: Ahmad.Varasteh

Created at: 2025-08-26

On NodePit since: 2025-08-27

Last update: 2026-03-03

Created with KNIME version: v5.5.1

Tags: fuzzy matching, approximate string matching, data cleaning, typo correction, fuzzy aggregation
