This directory contains 27 workflows.
The goal of this workflow is to cluster a set of newsgroup documents by their corresponding topics. The data is taken from the 20 newsgroups dataset. The […]
Document Classification: Model Training and Deployment. The goal of this workflow is to classify spam, using YouTube comments as the dataset. The […]
This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents, and transform them into numerical document […]
The workflow reads ten famous fairy tales of the Brothers Grimm and two dictionaries containing the names or parts of the names. Branch 1: The words […]
The workflow starts with a list of documents that were downloaded from PubMed, parsed beforehand, and saved as a data table. The data is available in […]
The workflow starts with a URL to a NY Times RSS news feed. The news feed is downloaded, parsed, and transformed into DocumentCells. Names of persons, […]
The workflow reads textual data from a CSV file and converts the strings into documents. The documents are then preprocessed, i.e., filtered and stemmed. The […]
We use the KNIME Simple Streaming nodes to do the first part of the text processing. See the first wrapped metanode. To enable streaming (for streaming […]
This workflow demonstrates how to apply fuzzy matching to two strings. The string matcher was designed exactly for this task, but is limited to the […]
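For readers unfamiliar with fuzzy string matching, the idea behind the last workflow can be sketched outside KNIME. The following is a minimal Python illustration that assumes Levenshtein edit distance as the similarity measure; it is not the workflow's actual implementation, and the helper names `levenshtein` and `best_match` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def best_match(query: str, candidates: list[str]) -> str:
    """Return the candidate closest to the query (case-insensitive)."""
    return min(candidates, key=lambda c: levenshtein(query.lower(), c.lower()))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(best_match("Grim", ["Grimm", "Green"]))
```

A smaller distance means a closer match, so the candidate with the lowest distance is taken as the best fuzzy match for the query string.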