06 Document Vector

Exercise: Creating and applying a document vector

In this exercise you'll build a document vector and then apply it to a new document.

1) Execute the Preprocessed documents metanode. It preprocesses the agendas of two instructor-led KNIME courses (L4-TP Introduction to Text Processing and L4-TS Introduction to Time Series Analysis).
2) Transform the preprocessed documents into a bag of words.
3) Calculate the document frequencies in the bag of words.
4) Filter the data to keep only the words that occur in both documents.
5) Transform the data into a document vector. Use Bitvector.
6) Execute the Preprocessed new document metanode. This document contains the agenda text of the L4-BD Introduction to Big Data with KNIME Analytics Platform course.
7) Transform the new document into a bag of words.
8) Apply the document vector model to the new document. Select Use settings from model. Which words occur in all three agendas?

Workflow nodes (with their annotations): Preprocessed documents, Bag Of Words Creator, DF (document frequency), Row Filter (only words in both documents), Document Vector (Bitvector), Preprocessed new document, Bag Of Words Creator, Document Vector Applier (common words in all agendas).
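
The logic of the exercise can also be sketched outside KNIME. The following is a minimal plain-Python illustration, not the KNIME nodes themselves: the function names are hypothetical, and the token lists are made-up stand-ins for the preprocessed agendas. It builds a bag of words, computes document frequencies, keeps the terms found in every training document, and then applies that fixed vocabulary as a bit vector to a new document.

```python
from collections import Counter

def bag_of_words(docs):
    """One (term, document) pair per distinct term in each document."""
    return [(term, name) for name, terms in docs.items() for term in set(terms)]

def document_frequency(bow):
    """Number of documents in which each term occurs."""
    return Counter(term for term, _ in bow)

def fit_vocabulary(docs):
    """Keep only the terms that occur in every training document."""
    df = document_frequency(bag_of_words(docs))
    return sorted(term for term, count in df.items() if count == len(docs))

def bit_vector(terms, vocabulary):
    """1 if the vocabulary term is present in the document, else 0."""
    present = set(terms)
    return [1 if term in present else 0 for term in vocabulary]

# Made-up tokens standing in for the preprocessed agendas (assumption).
training_docs = {
    "L4-TP": ["introduction", "text", "processing", "knime", "exercise"],
    "L4-TS": ["introduction", "time", "series", "knime", "exercise"],
}
new_doc = ["introduction", "big", "data", "knime", "spark"]

# Build the vocabulary and the training vectors.
vocabulary = fit_vocabulary(training_docs)
train_vectors = {name: bit_vector(terms, vocabulary)
                 for name, terms in training_docs.items()}

# Apply the same vocabulary to the new document; unseen terms are ignored.
new_vector = bit_vector(new_doc, vocabulary)
common_to_all = [term for term, bit in zip(vocabulary, new_vector) if bit == 1]

print(vocabulary)
print(new_vector)
print(common_to_all)
```

Because the vocabulary is fixed when the vector is created, the new document is described only in terms of words already shared by the two training agendas. Any position set to 1 in its bit vector therefore marks a word that occurs in all three documents.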
