06 Document Vector

Exercise: Creating and applying a document vector

In this exercise you'll build a document vector and then apply it to a new document.

1) Execute the Preprocessed documents metanode. It preprocesses the agendas of two instructor-led KNIME courses (L4-TP Introduction to Text Processing and L4-TS Introduction to Time Series Analysis).
2) Transform the preprocessed documents into a bag of words.
3) Calculate the document frequencies in the bag of words.
4) Filter the data by words that occur in both documents.
5) Transform the data into a document vector. Use Bitvector.
6) Execute the Preprocessed new document metanode. This document contains the agenda text of the L4-BD Introduction to Big Data with KNIME Analytics Platform course.
7) Transform the new document into a bag of words.
8) Apply the document vector model to the new document. Select Use settings from model.

Which words occur in all three agendas?

Nodes in the workflow: Preprocessed documents, Bag Of Words Creator, DF, Row Filter, Document Vector, Preprocessed new document, Bag Of Words Creator, Document Vector Applier.
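
For readers who want to see the logic behind the exercise outside KNIME, the steps can be sketched in plain Python. This is a minimal illustration only, not how the KNIME nodes are implemented. The token lists, the helper function names, and the sample words are all hypothetical and are not taken from the actual course agendas. It assumes that Bitvector means a 1 when a term occurs in a document and a 0 otherwise, and that applying the model reuses the vocabulary built from the first two documents.

```python
from collections import Counter

# Hypothetical, already-preprocessed token lists standing in for the two agendas.
docs = {
    "L4-TP": ["text", "processing", "document", "vector", "knime"],
    "L4-TS": ["time", "series", "analysis", "vector", "knime"],
}


def bag_of_words(documents):
    """Return the set of distinct terms found in each document."""
    return {name: set(tokens) for name, tokens in documents.items()}


def document_frequency(bow):
    """Count in how many documents each term occurs."""
    df = Counter()
    for terms in bow.values():
        df.update(terms)
    return df


def bit_vector(terms, vocabulary):
    """Encode a document as 1/0 flags over a fixed vocabulary."""
    return [1 if term in terms else 0 for term in vocabulary]


bow = bag_of_words(docs)
df = document_frequency(bow)

# Keep only the terms that occur in both training documents.
vocabulary = sorted(term for term, count in df.items() if count == len(docs))

# Document vectors for the training documents (all ones after the filter).
train_vectors = {name: bit_vector(terms, vocabulary) for name, terms in bow.items()}

# Hypothetical new document, encoded with the SAME vocabulary as the model.
new_doc = ["big", "data", "knime", "spark"]
new_vector = bit_vector(set(new_doc), vocabulary)

# Terms flagged with 1 occur in all three documents.
common = [term for term, bit in zip(vocabulary, new_vector) if bit]
print(vocabulary)   # ['knime', 'vector']
print(new_vector)   # [1, 0]
print(common)       # ['knime']
```

Because the vocabulary is fixed before the new document is seen, words that appear only in the new document (here "big", "data", "spark") are ignored. This is why the filtered vocabulary plus the applied bit vector answers the question of which words all three documents share.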
