Topic Detection Based on Movie Reviews

Sentiment Analysis (Classification) of Documents
Topic Detection Analysis - Movie Reviews

Topic detection extracts relevant information elements from unstructured text documents and groups them to define a number of topics. This workflow illustrates how to perform a topic detection analysis on movie reviews.

Task. Perform a topic detection in IMDb reviews.

Data Reading
Read IMDb reviews from a CSV file. The file is located in TheData/SocialMedia.

Pre-processing
Classic pre-processing of documents: Punctuation Erasure, Number Filter, N Chars Filter, Stop Word Filter, Case Converter. Double-click the metanode to see the subworkflow.

Topic Detection
Build a list of topics of the pre-processed documents using the Topic Extractor (Parallel LDA) node. Use 4 words for each topic and 8 topics.

Grouping
The GroupBy node concatenates the keywords for the identified topics.

Try this:
1) Go to the configuration window of the Topic Extractor (Parallel LDA).
2) Try to change the number of words and topics that you would like to detect in the document.

Workflow nodes
CSV Reader: Reading TheData/SocialMedia/IMDb-sample.csv
Document Creation and document pre-processing: Transformation of strings to documents
Topic Extractor (Parallel LDA): 4 words for 8 topics
GroupBy: Concatenate terms for topics
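
For readers who want to see what the pre-processing chain and the GroupBy step amount to outside KNIME, here is a minimal plain-Python sketch. It is not how the KNIME nodes are implemented. The stop word list, the `min_chars` threshold, and the function names `preprocess` and `concatenate_terms` are assumptions made for illustration, and the LDA step itself is not reproduced: the `(topic_id, term)` rows passed to `concatenate_terms` are hypothetical stand-ins for the Topic Extractor output.

```python
import string
from collections import defaultdict

# Tiny illustrative stop word list (an assumption, not KNIME's built-in list).
STOP_WORDS = {"the", "a", "an", "and", "of", "was", "is", "it", "this", "in", "to"}

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text, min_chars=3):
    """Apply the five listed filters to one review, in the order given above."""
    # Punctuation Erasure: replace punctuation with spaces, then split on whitespace.
    tokens = text.translate(_PUNCT_TO_SPACE).split()
    # Number Filter: drop tokens made up only of digits.
    tokens = [t for t in tokens if not t.isdigit()]
    # N Chars Filter: drop tokens shorter than min_chars characters.
    tokens = [t for t in tokens if len(t) >= min_chars]
    # Stop Word Filter: compared case-insensitively in this sketch.
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    # Case Converter: lower-case whatever remains.
    return [t.lower() for t in tokens]


def concatenate_terms(rows, sep=", "):
    """Group (topic_id, term) rows and join the terms of each topic into one string."""
    grouped = defaultdict(list)
    for topic_id, term in rows:
        grouped[topic_id].append(term)
    return {topic_id: sep.join(terms) for topic_id, terms in grouped.items()}


if __name__ == "__main__":
    print(preprocess("The movie was GREAT, 10 out of 10!"))
    # Hypothetical topic/term rows standing in for the Topic Extractor output.
    print(concatenate_terms([("topic_0", "film"), ("topic_0", "story"), ("topic_1", "actor")]))
```

With 8 topics and 4 words per topic, the Topic Extractor output would hold 32 such topic/term rows, and the grouping step would reduce them to 8 rows of 4 concatenated keywords each.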
