
Decision Tree Models

Sentiment Analysis (Classification) of Documents

This workflow shows how to import text from a CSV file, convert it to documents, pre-process the documents, and visualize a tag cloud based on positive and negative terms.

Topic Detection Analysis - Movie Reviews

Topic detection extracts relevant terms from unstructured text documents and groups them into topics. This workflow illustrates how to perform a topic detection analysis on movie reviews.

Task. Perform topic detection in IMDb reviews.

1 - Data Reading

Read IMDb reviews from a CSV file.
The file is located in TheData/SocialMedia.
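Outside the workflow, the reading step can be sketched in Python as follows. This is only an illustration: the column names ("Text", "Sentiment") and the in-memory sample are assumptions, not the layout of the actual IMDb file.

```python
import csv
import io


def read_reviews(handle, text_column="Text"):
    """Return the non-empty review texts found in a CSV file handle."""
    reader = csv.DictReader(handle)
    texts = []
    for row in reader:
        # A missing or empty cell is treated as "no review" and skipped.
        text = (row.get(text_column) or "").strip()
        if text:
            texts.append(text)
    return texts


# Hypothetical sample standing in for the real CSV file.
sample = io.StringIO(
    "Text,Sentiment\n"
    '"A great movie, loved it!",POS\n'
    '"Boring plot and weak acting.",NEG\n'
)
reviews = read_reviews(sample)
```

To read a real file, pass an open file handle instead, for example open(path, newline="", encoding="utf-8").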

2 - Pre-processing

- Classic pre-processing of documents: Punctuation Erasure, Number Filter, N Chars Filter, Stop Word Filter, Case Converter

Double-click the metanode to see the sub-workflow.
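The five filters listed above can be approximated in plain Python, applied in the same order. This is a rough sketch, not the behavior of the KNIME nodes themselves: the stop word list and the minimum token length of three characters are illustrative assumptions.

```python
import string

# Tiny illustrative stop word list (assumption, not the list KNIME ships).
STOP_WORDS = {"the", "a", "an", "and", "of", "it", "is", "was", "this", "in", "to"}


def preprocess(text, min_chars=3, stop_words=STOP_WORDS):
    """Apply the five classic filters to one document and return its tokens."""
    # Punctuation Erasure: strip all ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    # Number Filter: drop tokens that consist only of digits.
    tokens = [t for t in tokens if not t.isdigit()]
    # N Chars Filter: drop tokens shorter than min_chars.
    tokens = [t for t in tokens if len(t) >= min_chars]
    # Stop Word Filter: drop common words, compared case-insensitively.
    tokens = [t for t in tokens if t.lower() not in stop_words]
    # Case Converter: lower-case everything that is left.
    return [t.lower() for t in tokens]


tokens = preprocess("The plot was GREAT, 10 out of 10!")
```

For the sample sentence this leaves the tokens plot, great and out; a fuller stop word list would also remove "out".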

3 - Topic Detection

Extract topics from the pre-processed documents using the Topic Extractor (Parallel LDA) node. Set it to eight topics with four words per topic.

Try this:

1) Open the configuration window of the Topic Extractor (Parallel LDA) node.
2) Change the number of topics and the number of words per topic to detect in the documents.
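To get a feel for what the two settings control, here is a minimal LDA sketch in pure Python. It uses a simple single-threaded collapsed Gibbs sampler, not the parallel implementation behind the KNIME node, and the function name, the alpha and beta values, and the toy corpus are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, num_words, iterations=200,
              alpha=0.1, beta=0.01, seed=0):
    """Return the top num_words terms for each of num_topics topics."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Keep only words that are still assigned to the topic at least once.
    return [
        [w for w, c in topic_word[k].most_common(num_words) if c > 0]
        for k in range(num_topics)
    ]


# Toy corpus of already pre-processed documents (hypothetical data).
docs = [
    ["plot", "story", "script", "plot"],
    ["actor", "role", "performance", "actor"],
    ["story", "script", "plot"],
    ["performance", "actor", "role"],
]
topics = lda_gibbs(docs, num_topics=2, num_words=4)
```

The workflow's configuration corresponds to num_topics=8 and num_words=4 here. Raising the number of topics splits the corpus into finer groups; raising the number of words only lengthens the keyword list shown per topic.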

4 - Grouping

The GroupBy node concatenates the keywords of each identified topic.
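The grouping step amounts to a group-and-concatenate operation, which can be sketched as below. The input shape (one row per topic id and term) and the comma separator are assumptions for illustration, not the exact table the node receives.

```python
from collections import defaultdict


def concatenate_topic_terms(rows, separator=", "):
    """Group (topic_id, term) rows by topic and join each topic's terms."""
    grouped = defaultdict(list)
    for topic_id, term in rows:
        grouped[topic_id].append(term)
    return {topic_id: separator.join(terms) for topic_id, terms in grouped.items()}


# Hypothetical topic-term rows, one term per row.
rows = [
    ("topic_0", "film"),
    ("topic_0", "story"),
    ("topic_1", "actor"),
    ("topic_0", "plot"),
    ("topic_1", "role"),
]
topic_keywords = concatenate_topic_terms(rows)
```

Each topic ends up as a single entry whose value lists its keywords in the order they appeared in the input.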
