Example 4 - Building Sentiment Predictor - BERT

This workflow uses a Kaggle dataset of 14K customer tweets directed at six US airlines (https://www.kaggle.com/crowdflower/twitter-airline-sentiment). Contributors annotated the valence of each tweet as positive, negative, or neutral. Once users are satisfied with the model evaluation, they can export the trained BERT model for deployment to classify non-annotated data.

If you use this workflow, please cite:
F. Villaroel Ordenes & R. Silipo, “Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications”, Journal of Business Research 137(1):393-410, DOI: 10.1016/j.jbusres.2021.08.036.

The workflow proceeds in four steps:

1. Read the annotated Twitter dataset.
2. Data manipulation/preparation. A simplified process of data manipulation and preparation; users could also add more pre-processing using the Text Mining extension.
3. Upload, train, and apply a BERT model. To execute this part of the workflow, users need a dedicated Python environment and must select File -> Preferences -> KNIME -> Python Deep Learning -> TensorFlow 2. The environment needs the packages listed at https://hub.knime.com/redfield/extensions/se.redfield.bert.feature/latest
4. Evaluate the model. This part of the workflow writes the trained model to a file and validates its accuracy.

Workflow source: https://hub.knime.com/knime/spaces/Machine%20Learning%20and%20Marketing/latest/Sentiment%20Analysis~-z5F8gFJ9Fm1756a/

Notes from the workflow annotations:
- Before using the workflow, create a folder on your machine to which the BERT model is uploaded.
- Text is lower-cased, and the BERT model is applied with fine-tuning.
- Kaggle dataset: N = 14,640 tweets from consumers to airlines; only the text and category columns are kept.
- Partitioning: 80% training, 20% testing; the trained model reaches 83% accuracy.

Nodes used: CSV Reader, String Manipulation, Duplicate Row Filter, Column Filter, Partitioning, BERT Model Selector, BERT Classification Learner, BERT Predictor, Scorer.
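For readers who want to reproduce the data-preparation stage outside KNIME, the sketch below mirrors the workflow's pre-processing nodes in pandas: lower-casing (String Manipulation), duplicate removal (Duplicate Row Filter), column selection (Column Filter), and the 80/20 split (Partitioning). The column names `text` and `airline_sentiment` match the Kaggle CSV; the function name and the tiny inline sample are illustrative assumptions, not part of the workflow.

```python
# Minimal sketch of the KNIME pre-processing steps, assuming the Kaggle
# CSV column names "text" and "airline_sentiment". The BERT training
# itself is handled in KNIME by the Redfield BERT extension nodes.
import pandas as pd

def prepare_splits(df: pd.DataFrame, seed: int = 42):
    """Keep text + label, lower-case, drop duplicates, split 80/20."""
    df = df[["text", "airline_sentiment"]].copy()   # Column Filter
    df["text"] = df["text"].str.lower()             # String Manipulation
    df = df.drop_duplicates(subset="text")          # Duplicate Row Filter
    train = df.sample(frac=0.8, random_state=seed)  # Partitioning: 80% train
    test = df.drop(train.index)                     # remaining 20% for testing
    return train, test

if __name__ == "__main__":
    # Hypothetical mini-sample standing in for the 14,640-tweet dataset.
    raw = pd.DataFrame({
        "text": ["Great flight!", "great flight!", "Delayed again",
                 "Fine trip", "Lost my bag"],
        "airline_sentiment": ["positive", "positive", "negative",
                              "neutral", "negative"],
    })
    train, test = prepare_splits(raw)
    print(len(train), len(test))
```

The fixed `random_state` keeps the partition reproducible, analogous to setting a static seed in the KNIME Partitioning node.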
