This workflow shows how to import text from a CSV file, convert it to documents, pre-process the documents, and visualize a tag cloud based on positive and negative terms.
Topic Detection Analysis - Movie Reviews
Topic detection extracts relevant information elements from unstructured text documents and groups them into topics. This workflow illustrates how to perform a topic detection analysis on movie reviews.
Task. Perform topic detection on IMDb reviews.
1 - Data Reading
Read IMDb reviews from a CSV file.
The file is located in TheData/SocialMedia.
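Outside KNIME, the reading step can be sketched in a few lines of Python. This is only an illustration: the column names `Text` and `Sentiment` and the two sample rows are assumptions made for the example, not the layout of the actual file, and an in-memory string stands in for the CSV so the snippet runs on its own.

```python
import csv
import io

# Hypothetical stand-in for the CSV file; the real column names and
# contents may differ.
SAMPLE_CSV = '''Text,Sentiment
"A moving story, with great acting.",POS
"Two hours of my life I will never get back.",NEG
'''


def read_reviews(handle, text_column="Text"):
    """Return the review text of every row as a list of strings."""
    reader = csv.DictReader(handle)
    return [row[text_column] for row in reader]


# With a real file, pass open(path, newline="", encoding="utf-8") instead.
reviews = read_reviews(io.StringIO(SAMPLE_CSV))
print(len(reviews), "reviews read")
```

Each string in `reviews` plays the role of one document in the later steps.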
2 - Pre-processing
- Classic pre-processing of documents: Punctuation Erasure, Number Filter, N Chars Filter, Stop Word Filter, Case Converter
Double-click the metanode to see the sub-workflow.
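For readers who want to see what these five steps do to a single review, here is a minimal Python sketch that applies them in the order listed. It approximates the behaviour of the KNIME nodes rather than reproducing it: the stop word list is a tiny illustrative one, and the minimum term length of 3 is an assumed setting.

```python
import string

# Tiny illustrative stop word list; a real list is much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "it",
              "this", "was", "to", "in", "out"}

# Translation table that turns every ASCII punctuation mark into a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation,
                                " " * len(string.punctuation))


def preprocess(text, min_chars=3, stop_words=STOP_WORDS):
    """Turn one raw review into a list of cleaned, lower-cased terms."""
    # Punctuation Erasure
    tokens = text.translate(_PUNCT_TO_SPACE).split()
    # Number Filter: drop terms made only of digits
    tokens = [t for t in tokens if not t.isdigit()]
    # N Chars Filter: drop terms shorter than min_chars (assumed value)
    tokens = [t for t in tokens if len(t) >= min_chars]
    # Stop Word Filter (case-insensitive comparison)
    tokens = [t for t in tokens if t.lower() not in stop_words]
    # Case Converter
    return [t.lower() for t in tokens]


print(preprocess("The movie was GREAT, 10 out of 10!"))
```

The order matters: running the N Chars Filter before the Stop Word Filter already removes very short stop words such as "of".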
3 - Topic Detection
Build a list of topics from the pre-processed documents using the Topic Extractor (Parallel LDA) node. Extract eight topics, each described by four words.
Try this:
1) Open the configuration window of the Topic Extractor (Parallel LDA) node.
2) Change the number of topics and the number of words per topic that you want to extract from the documents.
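To make the two settings concrete, the sketch below fits a small LDA model in plain Python with collapsed Gibbs sampling. It is not the implementation behind the KNIME node, which runs a parallel LDA; the function name `lda_gibbs`, the hyperparameter values `alpha` and `beta`, the iteration count, and the toy documents are all assumptions chosen for illustration. The defaults `n_topics=8` and `n_words=4` mirror the settings described above.

```python
import random


def lda_gibbs(docs, n_topics=8, n_words=4, alpha=0.1, beta=0.01,
              n_iter=200, seed=0):
    """Fit LDA on tokenized docs; return the top n_words terms per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # counts per doc/topic
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # counts per topic/word
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    topics = []
    for k in range(n_topics):
        # Rank words by count in this topic; break ties alphabetically.
        ranked = sorted(range(n_vocab),
                        key=lambda wid: (-topic_word[k][wid], vocab[wid]))
        topics.append([vocab[wid] for wid in ranked[:n_words]])
    return topics


# Toy, already pre-processed documents (hypothetical).
docs = [
    ["space", "alien", "ship", "planet"],
    ["love", "wedding", "romance", "kiss"],
    ["alien", "planet", "space", "laser"],
    ["romance", "love", "kiss", "heart"],
]
topics = lda_gibbs(docs, n_topics=2, n_words=4, n_iter=50)
for k, words in enumerate(topics):
    print(f"topic_{k}:", words)
```

Changing `n_topics` and `n_words` here corresponds to changing the number of topics and the number of words per topic in the node's configuration window. With so few documents the resulting topics are not meaningful; the point is only to show where the two parameters enter.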
4 - Grouping
The GroupBy node concatenates the keywords of each identified topic into a single string.
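The grouping step amounts to a group-by on the topic identifier followed by a string concatenation of the terms. A minimal Python sketch follows, assuming the input is a table with one row per (topic id, term, weight); the row layout, the sample values, and the separator are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: one (topic id, term, weight) tuple per keyword.
topic_terms = [
    ("topic_0", "space", 12),
    ("topic_0", "alien", 9),
    ("topic_1", "love", 11),
    ("topic_1", "kiss", 7),
]


def concatenate_keywords(rows, sep=", "):
    """Group rows by topic id and join the terms of each topic."""
    grouped = defaultdict(list)
    for topic_id, term, _weight in rows:
        grouped[topic_id].append(term)
    return {topic_id: sep.join(terms) for topic_id, terms in grouped.items()}


for topic_id, keywords in concatenate_keywords(topic_terms).items():
    print(topic_id, "->", keywords)
```

The result has one entry per topic, which is easier to read than one row per keyword.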