Sentiment_Classification_with_NGrams

Sentiment Analysis (Classification) of Documents with NGram Features

This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents, and transform them into numerical document vectors consisting of single-word and 2-gram features.
Finally, two predictive models are trained on the vectors to predict the sentiment class of the documents.
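
The feature creation step can be illustrated outside KNIME with a minimal Python sketch. It is not the Feature Creation meta node itself: the function names, the binary (present/absent) encoding, and the `min_docs` frequency threshold are assumptions made for illustration, and the two example sentences are invented.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_vocabulary(docs, max_n, min_docs=1):
    """Collect 1..max_n-grams that occur in at least min_docs documents."""
    doc_freq = Counter()
    for tokens in docs:
        terms = set()
        for n in range(1, max_n + 1):
            terms.update(ngrams(tokens, n))
        doc_freq.update(terms)  # each term counted once per document
    return sorted(t for t, df in doc_freq.items() if df >= min_docs)

def document_vector(tokens, vocabulary, max_n):
    """Binary vector: 1 if the vocabulary term occurs in the document."""
    present = set()
    for n in range(1, max_n + 1):
        present.update(ngrams(tokens, n))
    return [1 if term in present else 0 for term in vocabulary]

# Two invented example documents, already tokenized.
docs = ["not good at all".split(), "really good movie".split()]

vocab_1 = build_vocabulary(docs, max_n=1)    # single-word features only
vocab_12 = build_vocabulary(docs, max_n=2)   # single-word and 2-gram features

for tokens in docs:
    print(document_vector(tokens, vocab_1, 1))
    print(document_vector(tokens, vocab_12, 2))
```

With 2-grams included, a phrase such as "not good" becomes a feature of its own, which single-word features cannot express.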

URL: Sentiment Analysis with N-Grams http://www.knime.org/blog/sentiment-analysis-with-n-grams
URL: Slides KNIME Analytics Platform Text Mining https://www.knime.com/form/material-download-registration

The workflow reads textual data from a CSV file and converts the strings into documents. The documents are then preprocessed, i.e. filtered and stemmed; the preprocessing takes place in the Preprocessing meta node. In the Feature Creation meta node, two kinds of feature sets and document vectors are created: the top set of vectors contains only single-word features, while the bottom set contains single-word and 2-gram features. After the document vectors have been created, the sentiment class is extracted and two predictive models are built and scored: one based only on single-word features, the other based on single-word and 2-gram features.

Workflow annotations: 1-gram features; 1- and 2-gram features; Supplementary Workflow

Nodes and their labels:
CSV Reader: Read IMDb reviews from CSV file
Document Creation: Transformation of strings to documents
Preprocessing: Preprocessing of documents
Feature Creation: Creation of document vectors of frequent 1grams and 2grams
Category To Class (one per branch): Extract sentiment label
Color Manager (one per branch): Color by sentiment label
Partitioning (one per branch): Training / test set
Decision Tree Learner (one per branch)
Decision Tree Predictor (one per branch): Apply decision tree model
Scorer (one per branch): Score decision tree model
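
The modeling part of the workflow (partition, learn a decision tree per feature set, predict, score) can be sketched in plain Python as follows. This is a simplified stand-in, not KNIME's Decision Tree Learner: the Gini-based splitting, the `max_depth` limit, the fixed 4/2 split, and all function names are assumptions, and the six reviews are toy data invented for illustration rather than IMDb reviews.

```python
from collections import Counter

def features(tokens, max_n):
    """Set of 1..max_n-grams present in a tokenized document."""
    return {" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def learn_tree(rows, depth=0, max_depth=3):
    """rows: list of (feature_set, label). Returns a label (leaf) or a
    (term, subtree_if_present, subtree_if_absent) tuple."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for term in sorted(set().union(*(f for f, _ in rows))):
        yes = [r for r in rows if term in r[0]]
        no = [r for r in rows if term not in r[0]]
        if not yes or not no:
            continue  # term does not split the rows
        score = (len(yes) * gini([l for _, l in yes])
                 + len(no) * gini([l for _, l in no])) / len(rows)
        if best is None or score < best[0]:
            best = (score, term, yes, no)
    if best is None:
        return majority
    _, term, yes, no = best
    return (term,
            learn_tree(yes, depth + 1, max_depth),
            learn_tree(no, depth + 1, max_depth))

def predict(tree, feats):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        term, if_present, if_absent = tree
        tree = if_present if term in feats else if_absent
    return tree

def accuracy(tree, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(tree, f) == label for f, label in rows) / len(rows)

# Toy reviews invented for illustration (not the IMDb data set).
reviews = [
    ("good movie", "POS"), ("really good", "POS"), ("not bad", "POS"),
    ("not good", "NEG"), ("bad movie", "NEG"), ("really bad", "NEG"),
]

for max_n in (1, 2):  # 1-gram branch, then 1- and 2-gram branch
    rows = [(features(text.split(), max_n), label) for text, label in reviews]
    train, test = rows[:4], rows[4:]  # fixed split standing in for Partitioning
    tree = learn_tree(train)
    print(f"max_n={max_n}: test accuracy = {accuracy(tree, test)}")
```

On data this small the accuracies are not meaningful; the point is only that both branches share the same learner and scorer and differ solely in the feature set they are given.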