12_DocumentVector_FeatureSpaceAdaption

Sentiment Analysis of Documents using Document Vector Adapter

The workflow shows how to use a Document Vector Adapter node in order to adjust the feature space of a second set of documents to make it identical to the feature space of a first, reference set of documents.

It starts by reading textual data from a CSV file and partitioning it into a training set and a test set. Both sets are converted into documents, which are then preprocessed (filtered and stemmed) and transformed into numerical document vectors. To make sure that the feature space of the test set is identical to the feature space of the training set, the Document Vector Applier node is applied to the test set. After the respective document vectors have been created, the sentiment class is extracted, and a predictive model is built and scored.
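The idea of a shared feature space can be sketched outside of KNIME in a few lines of plain Python. This is only an illustrative sketch, not how the KNIME nodes are implemented: the function names (`tokenize`, `fit_vocabulary`, `to_bit_vector`), the simple tokenizer, and the two sample sentences are assumptions made for this example. The vocabulary is built from the training documents only, and the test documents are then encoded against that same vocabulary, so terms unseen in training are dropped and terms missing from a test document become 0.

```python
import re


def tokenize(text):
    """Lowercase and split on non-letters (a simple stand-in for real preprocessing)."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def fit_vocabulary(documents, min_docs=1):
    """Collect terms that occur in at least min_docs documents, in sorted order."""
    counts = {}
    for doc in documents:
        for term in set(tokenize(doc)):
            counts[term] = counts.get(term, 0) + 1
    return sorted(t for t, c in counts.items() if c >= min_docs)


def to_bit_vector(document, vocabulary):
    """Encode a document as 0/1 flags, one per vocabulary term."""
    terms = set(tokenize(document))
    return [1 if term in terms else 0 for term in vocabulary]


# Hypothetical sample data, not taken from the IMDb file used by the workflow.
train_docs = ["A great and moving film", "A dull film"]
test_docs = ["Great acting, dull plot"]

# The feature space is defined by the training set only.
vocabulary = fit_vocabulary(train_docs)
train_vectors = [to_bit_vector(d, vocabulary) for d in train_docs]

# The test set reuses the training vocabulary, so both sets share the same columns.
test_vectors = [to_bit_vector(d, vocabulary) for d in test_docs]

print(vocabulary)
print(test_vectors)
```

Because both sets are encoded against the same vocabulary, a model trained on `train_vectors` can be applied to `test_vectors` column by column. The words "acting" and "plot" appear only in the test document and therefore have no column.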

Workflow sections: Reading and partitioning of data, Preprocessing, Classification

The workflow shows how to use a Document Vector Applier node to create a feature space based on the bag of words of the training data.

Workflow nodes and their labels:

Category To Class: Extract sentiment label
Color Manager: Color by sentiment label
Partitioning: Training / test set
Decision Tree Predictor: Apply decision tree model
Scorer: Score decision tree model
Document Creation: Transformation of strings to documents
File Reader: Read IMDb reviews from CSV file
Preprocessing: Preprocessing of documents
Decision Tree Learner: Build predictive model
Document Creation: Transformation of strings to documents
Preprocessing: Preprocessing of documents
Category To Class: Extract sentiment label
Color Manager: Color by sentiment label
Preprocessing II: Filtering based on occurrences
Preprocessing II: Creation of BoW and TF
Document Vector: Create bit vectors for documents
Document Vector Applier: Create term vector of the test set with identical feature space of the training set
