
Sentiment_Analysis_using_RNN

Building a Sentiment Analysis Predictive Model - Deep Learning using an RNN

This workflow uses a Kaggle dataset containing 14K customer tweets addressed to six US airlines: https://www.kaggle.com/crowdflower/twitter-airline-sentiment. Contributors annotated the valence of each tweet as positive, negative, or neutral. Once users are satisfied with the model evaluation, they should export 1) the dictionary, 2) the Category To Number model, and 3) the trained network for deployment on non-annotated data.

Reference:
F. Villarroel Ordenes & R. Silipo, “Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications”, Journal of Business Research 137:393-410, 2021, DOI: 10.1016/j.jbusres.2021.08.036.

Workflow steps:

1. Define the Network Architecture
The Keras Layer nodes (Keras Input Layer, Keras Embedding Layer, Keras LSTM Layer, Keras Dense Layer) define an LSTM-based recurrent neural network; the network structure can be extended by adding more Keras Layer nodes. The input shape is set to "?", which allows the network to handle different sequence lengths. The embedding layer takes the dictionary size as input dimension and outputs 128 units, the LSTM layer uses 256 units for the cell state, and the output layer is a softmax layer with 3 units. Note: an appropriate output layer for a multiclass classification task is a softmax layer with as many units as classes. (A Keras sketch of this architecture is included after step 5 below.)

2. Read the annotated Twitter dataset
The CSV Reader node reads the Kaggle dataset (N = 14,640 tweets from consumers to airlines).

3. Manipulate and Encode Data
The "Index encoding and zero padding" metanode performs an index encoding, i.e. it encodes each word with an index; this blog post describes different encoding options: https://www.knime.com/blog/text-encoding-a-review. The Category To Number node encodes each class with an index, and the word indexes are gathered into one sequence per tweet with the Create Collection Column node. In general, recurrent neural networks can handle sequences of different lengths; during training, however, all sequences must have the same length. The metanode therefore appends zeros to the end of the sequences so that all sequences have the same length, an approach known as zero padding (see the sketch below). The Partitioning node splits the data into 80% training and 20% testing.
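For illustration, the following minimal Python sketch mimics what the index encoding and zero padding step does. The example tweets, the dictionary construction, the padding length, and the helper names are assumptions, not the exact settings of the KNIME metanode.

```python
# Minimal sketch (not the exact KNIME implementation) of index encoding and
# zero padding as performed inside the "Index encoding and zero padding" metanode.
# The example tweets, dictionary, and max_len below are illustrative assumptions.

def build_dictionary(tokenized_tweets):
    """Assign every word a positive integer index; 0 is reserved for padding."""
    dictionary = {}
    for tweet in tokenized_tweets:
        for word in tweet:
            if word not in dictionary:
                dictionary[word] = len(dictionary) + 1
    return dictionary

def encode_and_pad(tokenized_tweets, dictionary, max_len):
    """Replace each word by its index and append zeros so all sequences share max_len."""
    sequences = []
    for tweet in tokenized_tweets:
        indexes = [dictionary.get(word, 0) for word in tweet][:max_len]
        indexes += [0] * (max_len - len(indexes))  # zero padding at the end
        sequences.append(indexes)
    return sequences

tweets = [["the", "flight", "was", "late"], ["great", "service"]]
dictionary = build_dictionary(tweets)
padded = encode_and_pad(tweets, dictionary, max_len=6)
# padded -> [[1, 2, 3, 4, 0, 0], [5, 6, 0, 0, 0, 0]]
```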
4. Train and Apply Network
The Keras Network Learner node trains the defined network. In the configuration window you can define the input column(s), the target column(s), the loss function, and the training parameters, e.g. number of epochs, batch size, and optimizer. Here the loss function is categorical cross entropy and the network is trained for 50 epochs. The Keras Network Executor node applies the trained network to the input data. In its configuration window you can select the input column(s) and define the output by clicking the "add output" button. In this workflow the softmax output layer is selected as output, so the outputs are the probabilities for the three classes.

5. Evaluate and Save Trained Network
The Extract Prediction metanode takes the probabilities produced by the Keras Network Executor node and extracts the class with the highest probability. The Scorer node compares the predictions with the annotations; the workflow reaches 74% accuracy. Once the evaluation is satisfactory, the dictionary, the Category To Number model, and the trained network are exported for deployment on non-annotated data. A Keras sketch of the architecture, training, and prediction steps follows below.
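The following Keras sketch approximates in Python what the Keras Layer nodes, the Keras Network Learner, the Keras Network Executor, and the Extract Prediction metanode do. Only the layer sizes (128 embedding units, 256 LSTM units, a 3-unit softmax output), the categorical cross entropy loss, and the 50 epochs come from the workflow annotations; the dictionary size, optimizer, batch size, and the dummy data are assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 10000   # assumption: size of the word dictionary
num_classes = 3      # positive, negative, neutral

# 1. Network architecture (Keras Input/Embedding/LSTM/Dense Layer nodes).
#    shape=(None,) corresponds to the "?" input shape, allowing variable sequence lengths.
model = keras.Sequential([
    layers.Input(shape=(None,)),
    layers.Embedding(input_dim=vocab_size, output_dim=128),
    layers.LSTM(256),
    layers.Dense(num_classes, activation="softmax"),
])

# 4. Training (Keras Network Learner): categorical cross entropy, 50 epochs.
#    The optimizer and batch size are assumptions, not documented workflow settings.
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Dummy zero-padded index sequences and one-hot targets, standing in for the
# encoded tweets produced in step 3.
x_train = np.random.randint(0, vocab_size, size=(100, 30))
y_train = keras.utils.to_categorical(np.random.randint(0, num_classes, size=100), num_classes)
model.fit(x_train, y_train, epochs=50, batch_size=32)

# 4./5. Applying the network (Keras Network Executor) yields the softmax
# probabilities; the Extract Prediction step keeps the class with the highest one.
probabilities = model.predict(x_train)
predicted_class = probabilities.argmax(axis=1)
```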

Nodes

CSV Reader, Create Collection Column, Category To Number, Partitioning, Keras Input Layer, Keras Embedding Layer, Keras LSTM Layer, Keras Dense Layer, Keras Network Learner, Keras Network Executor, Scorer, and the metanodes "Index encoding and zero padding" and "Extract Prediction".
