
07_Sentiment_Classification_with_NGrams

Sentiment Analysis (Classification) of Documents with NGram Features

The workflow reads text data from a CSV file and converts the strings into documents. The documents are then preprocessed, i.e. filtered and stemmed, in the Preprocessing metanode. In the Feature Creation metanode, two feature sets and their document vectors are created: the top set of vectors contains only single-word (1-gram) features, while the bottom set contains both single-word and 2-gram features.
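The difference between the two feature sets can be sketched in a few lines of plain Python. This is only an illustration of the idea, not what the KNIME nodes do internally: the stop-word list, the `min_docs` frequency threshold, the function names, and the two sample reviews are all assumptions made for the example, and stemming is left out for brevity.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the workflow's filter).
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "and", "of"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def features(text, max_n):
    """Collect every 1-gram up to max_n-gram of a text."""
    tokens = tokenize(text)
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(tokens, n))
    return feats

def document_vectors(texts, max_n, min_docs=1):
    """Build count vectors over n-grams found in at least min_docs documents."""
    per_doc = [Counter(features(t, max_n)) for t in texts]
    doc_freq = Counter(term for counts in per_doc for term in counts)
    vocab = sorted(term for term, df in doc_freq.items() if df >= min_docs)
    vectors = [[counts[term] for term in vocab] for counts in per_doc]
    return vocab, vectors

# Two made-up reviews, used only to show the shape of the output.
reviews = ["The movie was not good", "It was a good movie"]
vocab_1, vecs_1 = document_vectors(reviews, max_n=1)    # single words only
vocab_12, vecs_12 = document_vectors(reviews, max_n=2)  # single words and 2-grams
```

With single words alone, the first review differs from the second only by the feature `not`; the 2-gram set adds `not good`, which ties the negation to the word it modifies.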

After the document vectors have been created, the sentiment class is extracted and two predictive models (decision trees) are built and scored: one based only on single-word features, the other based on single-word and 2-gram features. Both models are compared in the ROC Curve node.
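As a rough sketch of what comparing two models by ROC curve involves, the snippet below computes ROC points and the area under the curve from class probabilities in plain Python. It is not the ROC Curve node's implementation; the function names are hypothetical, and the labels and scores are toy values invented for illustration, not results from this workflow.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels are 1 (positive) or 0 (negative); scores are the predicted
    probabilities of the positive class. The threshold is swept from the
    highest score to the lowest, and tied scores are handled together.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy values only: 1 = positive review, 0 = negative review.
labels = [1, 1, 0, 1, 0, 0]
scores_1gram = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical model A
scores_12gram = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]  # hypothetical model B

auc_1gram = auc(roc_points(labels, scores_1gram))
auc_12gram = auc(roc_points(labels, scores_12gram))
```

A larger area means the model ranks positive reviews above negative ones more consistently, so the two values give a single-number summary of the comparison the curves show.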

[Workflow diagram] This workflow trains two models for sentiment analysis, one using 1-grams and the second using 1-grams and 2-grams, and compares their performance. Sections: Data Import and Preprocessing; Model Comparison. Branches: 1-gram features; 1- and 2-gram features.

Nodes in the diagram: File Reader (read IMDb reviews from CSV file), Document Creation (transformation of strings to documents), Preprocessing (preprocessing of documents), Feature Creation (creation of document vectors of frequent 1-grams and 2-grams), Category To Class (extract sentiment label), Partitioning (training / test set), Decision Tree Learner, Decision Tree Predictor (apply decision tree model), Scorer (score decision tree model), Joiner (join class probabilities), ROC Curve (score decision tree models).
