
01_Text_Processing

This directory contains 27 workflows.

01_Document_Clustering

The goal of this workflow is to cluster a set of newsgroup documents into their corresponding topics. The data is taken from the 20 newsgroups dataset. The […]
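Outside of KNIME, the same idea can be sketched in a few lines of Python with scikit-learn; the category subset and parameter choices below are illustrative assumptions, not taken from the workflow.

```python
# Illustrative analogue (not the KNIME workflow itself): cluster 20 newsgroups
# posts by topic using TF-IDF features and k-means.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# The category subset here is an assumption for illustration.
categories = ["sci.space", "rec.sport.baseball", "talk.politics.misc", "comp.graphics"]
posts = fetch_20newsgroups(subset="train", categories=categories,
                           remove=("headers", "footers", "quotes"))

vectors = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(posts.data)
labels = KMeans(n_clusters=len(categories), n_init=10, random_state=0).fit_predict(vectors)
print(labels[:10])
```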

02_Document_Classification

Document Classification: Model Training and Deployment. The goal of this workflow is to perform spam classification using YouTube comments as the dataset. The […]
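A minimal Python sketch of the same training step is shown below, assuming a CSV file of labeled comments; the file name and column names are hypothetical, not taken from the workflow.

```python
# Minimal sketch of comment spam classification with scikit-learn
# (file name and column names are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

data = pd.read_csv("youtube_comments.csv")   # assumed columns: CONTENT, CLASS (0 = ham, 1 = spam)
X_train, X_test, y_train, y_test = train_test_split(
    data["CONTENT"], data["CLASS"], test_size=0.3, random_state=0)

model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```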

03_Sentiment_Classification

Sentiment Analysis (Classification) of Documents. This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents […]
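The import and preprocessing steps can be illustrated with a small Python sketch; the file path and column name below are assumptions for illustration.

```python
# Sketch of the import/preprocess step: read text from a CSV file and clean
# each row into a token list ("document"). Path and column are hypothetical.
import csv
import re

def preprocess(text):
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)             # erase punctuation and digits
    return [t for t in text.split() if len(t) > 2]    # crude term filtering

documents = []
with open("reviews.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        documents.append(preprocess(row["Text"]))

print(documents[0][:10])
```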

04_Dictionary_based_Tagging

The workflow starts by reading ten famous fairy tales by the Brothers Grimm and two dictionaries containing the names, or parts of the names. Branch 1: The words […]
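Dictionary-based tagging itself is easy to illustrate; the sketch below marks every token that appears in a small name dictionary (the names and the sentence are placeholders, not the workflow's dictionaries).

```python
# Minimal sketch of dictionary-based tagging: tag tokens found in a name
# dictionary, leave everything else untagged ("O").
name_dictionary = {"hansel", "gretel", "rapunzel", "cinderella"}

def tag_names(text, dictionary):
    tagged = []
    for token in text.split():
        word = token.strip(".,;!?").lower()
        tag = "NAME" if word in dictionary else "O"
        tagged.append((token, tag))
    return tagged

print(tag_names("Hansel and Gretel walked into the forest.", name_dictionary))
```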

05_Named_Entity_Tag_Cloud

The workflow starts with a list of documents that were downloaded from PubMed, parsed beforehand, and saved as a data table. The data is available in […]
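The tag-cloud step essentially weights each recognized entity by how often it occurs; a minimal sketch, using a stand-in entity list rather than the PubMed results, follows.

```python
# Sketch of the tag-cloud weighting: count occurrences of recognized entities
# (the entity list is a placeholder for the workflow's extraction results).
from collections import Counter

entities = ["BRCA1", "p53", "BRCA1", "insulin", "p53", "p53"]
for term, count in Counter(entities).most_common():
    print(f"{term}: {count}")   # font size in the tag cloud would scale with count
```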

06_NY_Times_RSS_Feed_Tag_Cloud

The workflow starts with the URL of a NY Times RSS news feed. The news feed is downloaded, parsed, and transformed into DocumentCells. Names of persons, […]
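Downloading and parsing an RSS feed can be sketched in Python with the third-party feedparser package; the feed URL below is only an example and not necessarily the one configured in the workflow.

```python
# Sketch of the feed step: download and parse an RSS feed, then collect the
# item texts as plain documents for further processing.
import feedparser   # third-party package: pip install feedparser

feed = feedparser.parse("https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml")
documents = [entry.title + ". " + entry.get("summary", "") for entry in feed.entries]
print(len(documents), "feed items parsed")
```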

07_Sentiment_Classification_with_NGrams

The workflow reads textual data from a CSV file and converts the strings into documents. The documents are then preprocessed, i.e. filtered and stemmed. The […]
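The n-gram part can be illustrated with scikit-learn's TfidfVectorizer; the toy documents below stand in for the preprocessed texts.

```python
# Sketch of the n-gram step: build unigram and bigram features that a
# downstream classifier could train on (toy documents, assumed setup).
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["great movie truly great", "terrible plot terrible acting"]
vectorizer = TfidfVectorizer(ngram_range=(1, 2))   # unigrams and bigrams
X = vectorizer.fit_transform(docs)
print(vectorizer.get_feature_names_out())          # includes bigrams such as "truly great"
```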

08_Streaming_Sentiment_Classification

We use the KNIME Simple Streaming nodes to do the first part of the text processing. See the first wrapped metanode. To enable streaming (for streaming […]
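As a conceptual analogue only (streaming in KNIME is configured on the wrapped metanode, not written as code), the sketch below pushes documents through a generator pipeline one at a time, so no intermediate table is materialized.

```python
# Conceptual analogue of streamed text processing: each document flows through
# the pipeline lazily instead of being collected into intermediate tables.
def read_documents(lines):
    for line in lines:
        yield line.strip()

def lowercase(docs):
    for doc in docs:
        yield doc.lower()

def tokenize(docs):
    for doc in docs:
        yield doc.split()

pipeline = tokenize(lowercase(read_documents(["First comment", "Second COMMENT"])))
for tokens in pipeline:
    print(tokens)
```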

09_Fuzzy_String_Matching

This workflow demonstrates how to apply fuzzy matching to two strings. The string matcher was designed exactly for this task, but it is limited to the […]
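A minimal Python analogue of fuzzy matching two strings, using the standard library's difflib, is shown below; the example names are placeholders and the similarity measure is not necessarily the one used by the workflow.

```python
# Sketch of fuzzy string matching: compute a similarity ratio in [0, 1],
# where higher values mean the strings are closer matches.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(similarity("Jon Smith", "John Smyth"))   # close to 1.0 for near matches
```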