
06 Transformation - solution

Text Mining Course: Preprocessing, Transformation, and Classification Models (solution)

- Compute relative term frequency.
- Create document vectors (bitvectors or numerical with TF values).
- Extract class label / category.
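The three steps above can be sketched outside KNIME in plain Python. This is only an illustration, not the implementation of the TF or Document Vector nodes: it assumes relative term frequency means a term's count divided by the number of terms in the document, and the helper names (`relative_tf`, `document_vectors`) and the toy documents and labels are hypothetical.

```python
from collections import Counter

def relative_tf(tokens):
    """Relative term frequency: term count / number of terms in the document (assumed definition)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}

def document_vectors(docs, as_bits=False):
    """Build one vector per document over a shared, sorted vocabulary.

    as_bits=True  -> bitvector (1 if the term occurs in the document, else 0)
    as_bits=False -> numerical vector holding the relative TF values
    """
    vocab = sorted({term for tokens in docs for term in tokens})
    vectors = []
    for tokens in docs:
        tf = relative_tf(tokens)
        if as_bits:
            vectors.append([1 if term in tf else 0 for term in vocab])
        else:
            vectors.append([tf.get(term, 0.0) for term in vocab])
    return vocab, vectors

# Hypothetical pre-processed documents with their class labels.
docs = [["good", "hotel", "good"], ["bad", "hotel"]]
labels = ["positive", "negative"]

vocab, tf_vectors = document_vectors(docs)
_, bit_vectors = document_vectors(docs, as_bits=True)

# Pair each vector with its class label, ready for a classifier.
training_rows = list(zip(tf_vectors, labels))
```

Keeping the class label next to each vector mirrors the last step of the list: the label is extracted once and carried alongside the features for prediction.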

URL: Slides KNIME Analytics Platform Text Mining Course https://www.knime.com/form/material-download-registration

Session 3 - Preprocessing, Transformation, and Classification Models
Solution 06 - Transformation

Learning objective: In this exercise, you will practice transforming pre-processed text data.

Workflow description: This workflow computes relative term frequency, creates document vectors, and extracts the class label/category for prediction. You’ll find the instructions for the exercises in the yellow annotations.

Computing relative term frequency
- Compute relative term frequency with the TF node.
- Create document vectors (bitvectors or numerical with TF values) using the Document Vector node.
- Extract class label/category for prediction with the Document Data Extractor node.

Workflow sections
- Reading Textual Data
- Enrichment
- Preprocessing
- Preprocessing II
- Your Solution

Node annotations
- Compute relative term frequency
- Extract category for prediction (class label)
- Create document vectors
- Create Bag of Words
- Assign POS tags
- GroupBy term, count documents
- Read Tripadvisor data
- Filter Bag of Words
- Keep only terms that occur in at least 5 documents
- Filter by number of documents
- Create documents
- No missings
- Only documents

Nodes in the workflow
- TF
- Document Data Extractor
- Document Vector
- Bag Of Words Creator
- POS Tagger
- Case Converter
- Stop Word Filter
- Tag Filter
- Snowball Stemmer
- Number Filter
- Punctuation Erasure
- GroupBy
- Table Reader
- Term to String
- Reference Row Filter
- Row Filter
- Strings to Document
- Row Filter
- Column Filter
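The annotation "Keep only terms that occur in at least 5 documents" describes a document-frequency filter applied to the bag of words before the term frequencies are computed. A minimal Python sketch of that idea follows; it is not how the GroupBy and Reference Row Filter nodes are implemented, and the function name `filter_by_document_frequency` and the sample documents are hypothetical. The example uses a threshold of 2 only so the toy data stays small.

```python
from collections import Counter

def filter_by_document_frequency(docs, min_docs=5):
    """Drop terms that occur in fewer than min_docs documents."""
    doc_freq = Counter()
    for tokens in docs:
        # Count each term once per document, regardless of repetitions.
        doc_freq.update(set(tokens))
    keep = {term for term, n in doc_freq.items() if n >= min_docs}
    return [[term for term in tokens if term in keep] for tokens in docs]

# Hypothetical tokenized documents.
docs = [["room", "clean"], ["room", "noisy"], ["room", "clean", "room"]]
filtered = filter_by_document_frequency(docs, min_docs=2)
```

Here "noisy" appears in only one document and is removed, while repeated occurrences of a kept term inside a document are preserved.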
