String Cleaner Example

This workflow shows different ways to use the "String Cleaner" node, which can be found in the node repository under the Manipulation category.

You can download the workflow and run it directly in your KNIME installation. We recommend using the latest version of the KNIME Analytics Platform for optimal performance. The workflow can also be deployed as a Data App in KNIME Business Hub.

The "String Cleaner" node allows you to perform various data-cleaning operations:

- Remove accents and diacritics such as "áéíóúüñ".

- Delete letters or numbers from a string value.

- Eliminate non-ASCII characters like "ø°½¹".

- Remove special characters such as "(){}[]<>|/\".

- Handle line breaks, leading spaces, and other common issues.
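For readers who want to see what these operations amount to outside of KNIME, here is a minimal Python sketch. It is not the node's implementation; the function names (`remove_accents`, `remove_digits`, `remove_non_ascii`, `remove_special`, `normalize_whitespace`, `clean`) and the order in which the steps are applied are assumptions made purely for illustration.

```python
import re
import unicodedata

# Hypothetical helper names; this is an illustration, not the String Cleaner node's code.
SPECIAL_CHARACTERS = "(){}[]<>|/\\"


def remove_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def remove_digits(text: str) -> str:
    """Delete the ASCII digits 0-9."""
    return re.sub(r"[0-9]", "", text)


def remove_non_ascii(text: str) -> str:
    """Drop every character outside the ASCII range."""
    return text.encode("ascii", "ignore").decode("ascii")


def remove_special(text: str) -> str:
    """Delete the bracket, pipe, and slash characters listed above."""
    return text.translate({ord(ch): None for ch in SPECIAL_CHARACTERS})


def normalize_whitespace(text: str) -> str:
    """Trim both ends and collapse line breaks and repeated spaces into one space."""
    return " ".join(text.split())


def clean(text: str) -> str:
    """Apply the steps in one assumed order: accents, special, non-ASCII, whitespace."""
    text = remove_accents(text)
    text = remove_special(text)
    text = remove_non_ascii(text)
    return normalize_whitespace(text)


if __name__ == "__main__":
    print(clean("  José\n (García) "))
```

Removing accents before removing non-ASCII characters matters in this sketch: applied the other way round, "é" would be dropped entirely instead of being reduced to "e".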

We created a synthetic dataset that mocks personal data, including name, surname, email, telephone, and webpage. The data contains inconsistencies that can be handled with the "String Cleaner" node.

After cleaning the data with the "String Cleaner" node, use the Table View nodes to compare the result with the initial input.

URL: Synthetic Data Wikipedia https://en.wikipedia.org/wiki/Synthetic_data
URL: Blog: Generate synthetic data to teach machine learning https://www.knime.com/blog/generate-synthetic-data-teach-ml
