02_DocumentVector_Creation

This workflow transforms a collection of documents into numerical vectors. The dataset used in this example is the KNIME Forum Dataset.
After the pre-processing phase, the relative term frequency of each term is computed inside the Transformation component.
The input dataset is partitioned into a training set and a test set.
A Document Vector node uses the term frequencies from the training set to build a vector representation over the distinct terms identified by the bag of words (BoW). The same Document Vector transformation is then applied to the documents in the test set.
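The steps described above can be sketched outside KNIME in plain Python. This is only an illustration under stated assumptions, not the implementation of the TF or Document Vector nodes: the helper names (`relative_tf`, `fit_vocabulary`, `to_vector`) and the toy documents are hypothetical, and relative term frequency is assumed to mean a term's count divided by the number of terms in its document.

```python
from collections import Counter


def relative_tf(tokens):
    """Relative term frequency: term count divided by document length (assumed definition)."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: n / total for term, n in Counter(tokens).items()}


def fit_vocabulary(train_docs):
    """Collect the distinct terms of the training documents in a fixed, sorted order."""
    return sorted({term for doc in train_docs for term in doc})


def to_vector(tokens, vocabulary):
    """Project one document onto the vocabulary; terms outside it are dropped."""
    tf = relative_tf(tokens)
    return [tf.get(term, 0.0) for term in vocabulary]


# Hypothetical, already pre-processed documents (lists of terms).
train_docs = [["node", "workflow", "node"], ["error", "workflow"]]
test_docs = [["node", "error", "crash", "crash"]]

# The vocabulary comes from the training set only ...
vocabulary = fit_vocabulary(train_docs)
train_vectors = [to_vector(doc, vocabulary) for doc in train_docs]

# ... and the same transformation is reused for the test set.
test_vectors = [to_vector(doc, vocabulary) for doc in test_docs]
```

Because the vocabulary is fixed by the training set, the term "crash", which occurs only in the test document, gets no vector column, and training and test vectors have the same length.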

Nodes

Component Input (5 ×)
Component Output (5 ×)
Bag Of Words Creator (2 ×)
Row Filter (2 ×)
TF (2 ×)
(excerpt; the full list has 18 nodes)

Extensions

KNIME Base nodes
KNIME Textprocessing
