01_Text_Processing

This directory contains 27 workflows.

20_Blending_Languages

01_Document_clustering

The goal of this workflow is to cluster a set of newsgroup documents by their corresponding topics. The data is taken from the 20 newsgroups dataset. The […]

02_Document_Classification

Document Classification: Model Training and Deployment. The goal of this workflow is to classify spam, using YouTube comments as the dataset. The […]

03_Sentiment_Classification

Sentiment Analysis (Classification) of Documents. This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents […]

04_Dictionary_based_Tagging

The workflow starts by reading ten famous fairy tales of the Brothers Grimm and two dictionaries containing the names or parts of the names. Branch 1: The words […]

05_Named_Entity_Tag_Cloud

The workflow starts with a list of documents that were downloaded from PubMed, parsed beforehand, and saved as a data table. The data is available in […]

06_NY_Times_RSS_Feed_Tag_Cloud

The workflow starts with a URL to a NY Times RSS news feed. The news feed is downloaded, parsed, and transformed into DocumentCells. Names of persons, […]

07_Sentiment_Classification_with_NGrams

The workflow reads textual data from a CSV file and converts the strings into documents. The documents are then preprocessed, i.e. filtered and stemmed. The […]

08_Streaming_Sentiment_Classification

We use the KNIME Simple Streaming nodes to do the first part of the text processing. See the first wrapped metanode. To enable streaming (for streaming […]

09_Fuzzy_String_Matching

This workflow demonstrates how to apply fuzzy matching to two strings. The string matcher was designed exactly for this task, but is limited to the […]
