
JKISeason2-06_rev00


You work for a marketing agency that monitors the online presence of a few airline companies to understand how they are being reviewed. You were asked to identify whether a tweet mentioning an airline is positive, neutral, or negative, and decided to implement a simple sentiment analysis classifier for this task. What accuracy can you get when automating this process? Is the classifier likely to help company reviewers save time?

Note: Given the size of the dataset, training the classifier may take a little while to execute on your machine (especially if you use more sophisticated methods). Feel free to use only a part of the dataset in this challenge if you want to speed up your solution.

Hint 1: Check our Textprocessing extension to learn more about how you can turn tweets' words into features that a classifier can explore.

Hint 2: Study, use, and/or adapt the shared components Enrichment and Preprocessing and Document Vectorization (in this order!) if you want to get a part of the work done more quickly. They were created especially for this challenge.

Hint 3: Remember to partition the dataset into training and test sets in order to create the decision tree model and then evaluate it. Feel free to use the partitioning strategy you prefer.

Workflow preview (node labels): Node 8, Enrichment and Preprocessing, Document Vectorization, Table Reader, Output
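
For readers who want to see the steps from the hints outside KNIME, here is a minimal Python sketch of the same idea: turn each tweet into binary term-presence features, partition the rows, grow a small decision tree, and measure accuracy. It is not the challenge's workflow or the shared components. The toy `TWEETS` list, all function names, the crude tokenizer, and the Gini-based split rule are assumptions made for illustration only.

```python
import random
from collections import Counter

# Hypothetical toy data, NOT the challenge dataset.
TWEETS = [
    ("great flight, thanks", "positive"),
    ("thanks for the friendly crew", "positive"),
    ("awful delay and lost my bag", "negative"),
    ("terrible service, rude staff", "negative"),
    ("what time is boarding", "neutral"),
    ("is the gate open yet", "neutral"),
]

def tokenize(text):
    # Lowercase and keep alphabetic tokens only (a crude stand-in for preprocessing).
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def build_vocabulary(docs):
    return sorted({w for d in docs for w in tokenize(d)})

def vectorize(doc, vocab):
    # Binary bag of words: 1 if the term occurs in the document, else 0.
    words = set(tokenize(doc))
    return [1 if term in words else 0 for term in vocab]

def partition(rows, train_fraction=0.7, seed=0):
    # Shuffle a copy with a fixed seed, then cut into training and test sets.
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def train_tree(X, y, depth=0, max_depth=5):
    # A leaf is a class label; an inner node is (feature_index, absent, present).
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or depth == max_depth:
        return majority
    best = None
    for j in range(len(X[0])):
        left = [lab for x, lab in zip(X, y) if x[j] == 0]
        right = [lab for x, lab in zip(X, y) if x[j] == 1]
        if not left or not right:
            continue  # this term does not split the rows
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, j)
    if best is None:
        return majority
    j = best[1]
    absent = [(x, lab) for x, lab in zip(X, y) if x[j] == 0]
    present = [(x, lab) for x, lab in zip(X, y) if x[j] == 1]
    return (
        j,
        train_tree([x for x, _ in absent], [l for _, l in absent], depth + 1, max_depth),
        train_tree([x for x, _ in present], [l for _, l in present], depth + 1, max_depth),
    )

def predict(tree, x):
    while isinstance(tree, tuple):
        j, absent, present = tree
        tree = present if x[j] == 1 else absent
    return tree

def accuracy(tree, X, y):
    return sum(predict(tree, x) == lab for x, lab in zip(X, y)) / len(y)
```

A typical use would be to call `partition(TWEETS)`, build the vocabulary from the training tweets only, vectorize both parts with that vocabulary, fit with `train_tree`, and report `accuracy` on the test part. With so few toy rows the resulting number says nothing about the real dataset.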
