
01_BERT_Sentiment_Analysis

Sentiment Analysis with BERT

This workflow demonstrates how to do sentiment analysis by fine-tuning Google's BERT network.
The idea is straightforward: a small classification MLP is applied on top of BERT, which is downloaded from TensorFlow Hub.
The full network is then trained end-to-end on the task at hand.
After one epoch of training, the network should already reach more than 85% accuracy on the test set.
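The idea of a small classification MLP on top of pooled embeddings can be sketched in plain Python. This is an illustrative forward pass only, with made-up dimensions and random weights; the actual workflow builds this head as Keras layers on BERT's pooled output (BERT-base pools to 768 dimensions, not the 8 used here):

```python
import math
import random

def mlp_head(pooled_embedding, w_hidden, b_hidden, w_out, b_out):
    """Tiny classification head: one tanh hidden layer, sigmoid output."""
    hidden = [math.tanh(sum(x * w for x, w in zip(pooled_embedding, col)) + b)
              for col, b in zip(w_hidden, b_hidden)]
    logit = sum(h * w for h, w in zip(hidden, w_out)) + b_out
    # Sigmoid squashes the logit into a probability of positive sentiment.
    return 1.0 / (1.0 + math.exp(-logit))

random.seed(0)
dim, hidden_units = 8, 4  # toy sizes for illustration
embedding = [random.uniform(-1, 1) for _ in range(dim)]
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(dim)]
            for _ in range(hidden_units)]
b_hidden = [0.0] * hidden_units
w_out = [random.uniform(-0.5, 0.5) for _ in range(hidden_units)]

p = mlp_head(embedding, w_hidden, b_hidden, w_out, 0.0)
print(p)
```

During fine-tuning, gradients flow through this head into the BERT weights themselves, which is what "end-to-end" training means here.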
Once training is completed, the "Visualize Before vs After" component shows the difference between the BERT embeddings before and after training. You should see that training introduced a much clearer separation between the classes. The view also lets you interactively experiment with different classification thresholds.
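What varying the classification threshold does can be illustrated with a small sketch: given predicted probabilities and true labels, different thresholds trade off which reviews get flagged as positive. The numbers below are toy values, not output of the workflow:

```python
def accuracy_at_threshold(probs, labels, threshold):
    """Fraction of examples classified correctly at a given cutoff."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(1 for p, l in zip(preds, labels) if p == l) / len(labels)

probs = [0.05, 0.20, 0.45, 0.55, 0.80, 0.95]  # toy predicted probabilities
labels = [0, 0, 0, 1, 1, 1]                    # toy ground-truth sentiments

for t in (0.3, 0.5, 0.7):
    print(t, accuracy_at_threshold(probs, labels, t))
```

A well-separated embedding space is exactly what makes accuracy robust across a wide range of thresholds, which is what the component's view lets you inspect.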

The dataset used here consists of the first 10,000 reviews of the IMDB Movie Reviews dataset (http://ai.stanford.edu/~amaas/data/sentiment/) from "Learning Word Vectors for Sentiment Analysis" by Maas et al.
If you want to train a better model, we recommend downloading the full dataset and training on it instead of the subset that comes with the workflow.

Additional Notes:

The red flow variable connections are used to enforce a sequential execution of nodes that make use of TensorFlow in order to prevent memory issues (especially if you are using a GPU).
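A related way to reduce GPU memory pressure is to let TensorFlow allocate memory on demand instead of reserving the whole GPU up front. This is a configuration sketch using TensorFlow 2's standard API, not part of the workflow itself; it would go at the top of a DL Python scripting node:

```python
import tensorflow as tf

# Allocate GPU memory incrementally as needed, rather than all at once.
# This makes it less likely that sequentially executed TensorFlow nodes
# run out of memory when they share the same GPU.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```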

If you wish to track your training progress, you can go to File->Preferences->KNIME->KNIME GUI and set the console log level to Info. Then you can monitor the status of the training in the console view (typically at the bottom right of the KNIME workbench).

Required KNIME extensions:

- KNIME Python Integration
- KNIME Deep Learning - Keras Integration
- KNIME Deep Learning - TensorFlow 2 Integration
- KNIME Statistics Nodes (Labs)
- KNIME Machine Learning Interpretability Extension

Required Python packages (need to be available in your TensorFlow 2 Python environment):

- tensorflow_hub
- bert-for-tf2
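Installing these into the environment that KNIME's TensorFlow 2 integration uses could look like the following (the environment name is an example; adjust it to whatever environment you configured in the KNIME preferences):

```shell
# Activate the conda environment configured for the TF2 integration
# ("tf2_knime" is a placeholder name), then install the extra packages.
conda activate tf2_knime
pip install tensorflow_hub bert-for-tf2
```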

URL: TensorFlow Hub https://tfhub.dev/
URL: BERT Blog Post https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
URL: Link to full dataset http://ai.stanford.edu/~amaas/data/sentiment/
URL: Link to dataset paper http://ai.stanford.edu/~amaas/papers/wvSent_acl2011.pdf

This workflow trains a sentiment classifier for English movie reviews that is based on Google's BERT model. For more information see the workflow metadata. Find it here: View -> Description
Prepares reviews for training
Preprocessing
CSV Reader
Open the dialog to specify the number of epochs and the batch size
Train BERT Classifier
Predicts review sentiments and retrieves BERT embeddings after training
TensorFlow 2 Network Executor
Retrieves BERT embeddings before training
TensorFlow 2 Network Executor
Visualize Before vs After
BERT TensorFlow Hub URL
String Configuration
Loads BERT from TF Hub and adds a small classifier on top of it
DL Python Network Creator
