JKISeason2-6

There has been no description set for this workflow's metadata.

Challenge 06: Airline Reviews

Level: Hard

Description: You work for a marketing agency that monitors the online presence of a few airline companies to understand how they are being reviewed. You were asked to identify whether a tweet mentioning an airline is positive, neutral, or negative, and decided to implement a simple sentiment analysis classifier for this task. What accuracy can you get when automating this process? Is the classifier likely to help company reviewers save their time?

Note: Given the size of the dataset, training the classifier may take a little while to execute on your machine (especially if you use more sophisticated methods). Feel free to use only a part of the dataset in this challenge if you want to speed up your solution.

Hint 1: Check our Textprocessing extension to learn more about how you can turn tweets' words into features that a classifier can explore.

Hint 2: Study, use, and/or adapt the shared components Enrichment and Preprocessing and Document Vectorization (in this order!) if you want to get a part of the work done more quickly. They were created especially for this challenge.

Hint 3: Remember to partition the dataset into training and test sets in order to create the decision tree model and then evaluate it. Feel free to use the partitioning strategy you prefer.

Workflow annotations: Training (Top), Test (Bottom); Accuracy 0.714; Cohen's kappa 0.404

Workflow nodes: Table Reader, Enrichment and Preprocessing, Document Vectorization, Partitioning, Decision Tree Learner, Decision Tree View, Decision Tree Predictor, Scorer
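To make the two reported metrics easier to interpret, here is a minimal Python sketch of what document vectorization and scoring compute. It is an illustration only, not the implementation of the KNIME components or the Scorer node: the function names, the whitespace tokenizer, and the example tweets and labels are all hypothetical. Cohen's kappa corrects accuracy for the agreement expected by chance, which is why it can be much lower than accuracy when one class (for instance, negative tweets) dominates.

```python
from collections import Counter


def document_vectors(docs):
    """Binary bag-of-words: one column per vocabulary term, 1 if the term occurs.

    Assumption: naive lowercase + whitespace tokenization, far simpler than
    the preprocessing a real text-mining pipeline would apply.
    """
    tokenized = [set(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*tokenized))
    vectors = [[1 if term in tokens else 0 for term in vocab] for tokens in tokenized]
    return vocab, vectors


def accuracy(actual, predicted):
    """Fraction of rows where the predicted class equals the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohens_kappa(actual, predicted):
    """Agreement between actual and predicted labels, corrected for chance."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: sum over classes of P(actual = c) * P(predicted = c).
    p_expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_expected == 1.0:
        # Degenerate case: a single class on both sides, so agreement is perfect.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example data, not taken from the challenge dataset.
example_docs = ["great flight", "late flight"]
example_vocab, example_vectors = document_vectors(example_docs)

example_actual = ["negative", "negative", "negative", "neutral", "positive", "positive"]
example_predicted = ["negative", "negative", "neutral", "neutral", "positive", "negative"]

print(example_vocab, example_vectors)
print(accuracy(example_actual, example_predicted))
print(cohens_kappa(example_actual, example_predicted))
```

In this toy example four of six predictions are correct, yet kappa comes out noticeably lower than accuracy because part of that agreement would be expected by chance. The same reading applies to the workflow's reported accuracy of 0.714 and kappa of 0.404.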
