This workflow demonstrates several ways to use the "String Cleaner" node. The "String Cleaner" node can be found in the node repository under the Manipulation category.
You can download the workflow and run it directly in your KNIME installation. We recommend using the latest version of the KNIME Analytics Platform for optimal performance. The workflow can also be deployed as a Data App in KNIME Business Hub.
The "String Cleaner" node supports several data-cleaning operations:
- Remove accents and diacritics such as "áéíóúüñ".
- Delete letters or numbers from a string value.
- Eliminate non-ASCII characters like "ø°½¹".
- Remove special characters such as "(){}[]<>|/\\".
- Handle line breaks, leading spaces, and other common issues.
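For readers who want to see what these operations amount to outside KNIME, the following is a minimal Python sketch of comparable string cleaning using only the standard library. It is not the node's implementation, and the node's exact behavior may differ; all function names here are hypothetical and chosen only for illustration.

```python
import re
import unicodedata


def remove_accents(text: str) -> str:
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def remove_digits(text: str) -> str:
    """Delete every digit character from the string."""
    return re.sub(r"\d", "", text)


def remove_non_ascii(text: str) -> str:
    """Drop any character that cannot be encoded as ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def remove_special(text: str, chars: str = "(){}[]<>|/\\") -> str:
    """Remove the listed special characters (the default set is an assumption)."""
    return text.translate({ord(c): None for c in chars})


def normalize_whitespace(text: str) -> str:
    """Collapse line breaks and repeated spaces, and trim both ends."""
    return " ".join(text.split())


def clean(text: str) -> str:
    """Apply the steps in one fixed order; the order is an illustrative choice."""
    text = remove_accents(text)
    text = remove_non_ascii(text)
    text = remove_special(text)
    return normalize_whitespace(text)
```

Accents are removed before the non-ASCII filter on purpose: applied the other way round, a letter such as "é" would be deleted entirely instead of being reduced to "e".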
We created a synthetic dataset that mimics personal data, including name, surname, email, telephone, and webpage. The data contains inconsistencies that can be handled with the "String Cleaner" node.
After cleaning the data with the "String Cleaner" node, use the Table View nodes to compare the result with the initial input.
URL: Synthetic Data Wikipedia https://en.wikipedia.org/wiki/Synthetic_data
URL: Blog: Generate synthetic data to teach machine learning https://www.knime.com/blog/generate-synthetic-data-teach-ml