BBC Documents multiclass classification with BERT extension
This workflow trains a multiclass classifier using the Redfield BERT Nodes extension, which is based on Google's BERT model. It then assigns each text excerpt to one of five categories: business, entertainment, politics, sport, tech. The BBC data set can be found on Kaggle: https://www.kaggle.com/shivamkushwaha/bbc-full-text-document-classification
For more information see the workflow metadata. Find it here: View -> Description

Workflow steps:
- Pre-processing: lowercase the input text; split the data into training, validation and test samples (an 80/20 training/test split, then an 80/20 training/validation split)
- Data upload: zip archive from https://www.kaggle.com/shivamkushwaha/bbc-full-text-document-classification
- BERT model upload
- Training the classifier, once without fine-tuning and once with fine-tuning, and saving each model
- Prediction of categories based either on the newly trained model or on the saved and uploaded model
- Performance analysis and comparison

Workflow elements: Upload the data set, BERT Model Selector, String Manipulation, Partitioning, BERT Classification Learner, BERT Predictor, Model Writer, Model Reader, Timer Info, Evaluation
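For readers who want to see the pre-processing outside KNIME, the lowercasing and the two 80/20 partitions can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's implementation. It assumes that the 80/20 figures are percentages, that the test partition is taken first, and that sampling is plain random; the Partitioning node's actual settings may differ. The function name `lowercase_and_split` and the sample records are hypothetical.

```python
import random

CATEGORIES = ["business", "entertainment", "politics", "sport", "tech"]


def lowercase_and_split(records, seed=0):
    """Lowercase each text, then split into train, validation and test lists.

    `records` is a list of (text, label) pairs.
    Assumption: 20% is held out for testing first, then 20% of the
    remainder is held out for validation, using plain random sampling.
    """
    rows = [(text.lower(), label) for text, label in records]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(rows)

    n_test = len(rows) // 5  # 20% of all rows
    test, rest = rows[:n_test], rows[n_test:]

    n_val = len(rest) // 5  # 20% of the remaining rows
    validation, train = rest[:n_val], rest[n_val:]
    return train, validation, test


# Hypothetical stand-in for the BBC documents: 100 short texts with labels.
sample = [(f"Document {i} TEXT", CATEGORIES[i % 5]) for i in range(100)]
train, validation, test = lowercase_and_split(sample)
print(len(train), len(validation), len(test))  # 64 16 20
```

With 100 documents this yields 64 training, 16 validation and 20 test rows, so the classifier is tuned on the validation rows and scored on test rows it has never seen.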
