Word2Vec Learner (Tensorflow)

Both hierarchical softmax and negative sampling are available for training. The node uses Tensorflow as its engine to speed up pre-processing and to fit the model. If a CUDA-compatible NVIDIA GPU is present, training can be performed on the GPU.


Column selection (String type)

Select the string column containing the documents you want to use to train the model.

Set seed

Set seeds for the whole node.


Choose the seed value if you do not want to use the default one.

Device for Tensorflow model fit

Choose the device on which to fit the Word2Vec model; only visible devices are listed. Note that the index next to each device name is just an identifier for that device.

Word2Vec parameters

Embedding size

Set the size of the two Word2Vec embedding layers (for target and context words, respectively). A smaller number trains faster; a larger number can give better embeddings.

Window size (radius)

Choose the radius of the context window, that is, how far from the target word Word2Vec looks. The window is always centered on the target word, so the number of context words considered is twice the value entered here.
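
As an illustration of what "radius" means, here is a minimal sketch in plain Python. The function name `context_window` is hypothetical, and truncating the window at the start and end of the text is an assumption; the node's actual handling of boundaries is not documented here.

```python
def context_window(tokens, center, radius):
    """Return the context words within `radius` positions of tokens[center]."""
    start = max(0, center - radius)
    end = min(len(tokens), center + radius + 1)
    # Everything inside the window except the target word itself.
    return tokens[start:center] + tokens[center + 1:end]


tokens = "the quick brown fox jumps over the lazy dog".split()

# Radius 2 around "jumps": two words on each side, four in total.
print(context_window(tokens, 4, 2))  # ['brown', 'fox', 'over', 'the']

# Near the edge of the text the window is cut short (assumed behavior).
print(context_window(tokens, 0, 2))  # ['quick', 'brown']
```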

Number of negative samples

Negative sampling reduces the computational cost of vanilla Word2Vec and, by introducing noise into the model, also acts as a regularizer. Choose the number of negative samples drawn for each training example.
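
To show why the number of negative samples affects cost, the following sketch computes the standard negative sampling loss for one training example in plain Python. It is not taken from the node's source; the function names and the use of plain lists as vectors are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(target_vec, context_vec, negative_vecs):
    """Loss for one true (target, context) pair plus k sampled noise words."""
    # Reward a high score for the true pair.
    loss = -math.log(sigmoid(dot(target_vec, context_vec)))
    # Penalize a high score for each sampled noise word.
    for negative_vec in negative_vecs:
        loss -= math.log(sigmoid(-dot(target_vec, negative_vec)))
    return loss


target = [0.5, -0.2]
context = [0.4, 0.1]
negatives = [[-0.3, 0.2], [0.1, 0.9], [0.0, -0.5]]  # 3 negative samples
print(negative_sampling_loss(target, context, negatives))
```

Each example touches only 1 + k output vectors instead of the whole vocabulary, so the cost per example grows with the number of negative samples rather than with the vocabulary size.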

Hierarchical Softmax

Activate hierarchical softmax in place of negative sampling. This option thus deactivates negative sampling.

Word2Vec algorithm selection

Choose between the CBOW (target as output) and skip-gram (context as output) variants of Word2Vec.
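
The difference between the two variants can be seen in the training examples each one produces. The sketch below is a plain Python illustration, not the node's implementation; the function name `training_examples` and the `(input, output)` tuple layout are assumptions.

```python
def training_examples(tokens, radius, algorithm="skip-gram"):
    """Build (input, output) training examples for CBOW or skip-gram."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - radius):i] + tokens[i + 1:i + radius + 1]
        if algorithm == "cbow":
            # CBOW: the context words together predict the target.
            examples.append((context, target))
        else:
            # Skip-gram: the target predicts each context word separately.
            for word in context:
                examples.append((target, word))
    return examples


tokens = ["a", "b", "c"]
print(training_examples(tokens, 1, "cbow"))
# [(['b'], 'a'), (['a', 'c'], 'b'), (['b'], 'c')]
print(training_examples(tokens, 1, "skip-gram"))
# [('a', 'b'), ('b', 'a'), ('b', 'c'), ('c', 'b')]
```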

Word Survival Function

Whether to use a word survival function to reduce the size of the vocabulary by prioritizing rarer words.

Sampling rate for Word Survival Function (if flagged)

Set the sampling rate for the Word Survival Function. The higher the rate, the more words are included in the dictionary. The default value is 10^-3 and the maximum value is 0.1.
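
For intuition about how the sampling rate acts, the sketch below uses the subsampling formula from the original word2vec C implementation. Whether this node uses exactly this formula is an assumption; the function name `survival_probability` is hypothetical.

```python
import math


def survival_probability(word_fraction, sampling_rate=1e-3):
    """Probability of keeping a word that makes up `word_fraction` of the corpus.

    Assumed formula (original word2vec C code):
    (sqrt(z / s) + 1) * s / z, capped at 1.
    """
    if word_fraction <= 0:
        return 1.0
    ratio = word_fraction / sampling_rate
    return min(1.0, (math.sqrt(ratio) + 1.0) / ratio)


# A word covering 1% of the corpus, at the default rate of 10^-3.
print(round(survival_probability(0.01), 4))  # 0.4162

# Raising the sampling rate keeps more occurrences of the same word.
print(survival_probability(0.01, sampling_rate=0.01))  # 1.0
```

Under this formula rare words always survive, while very frequent words are dropped more often as the sampling rate decreases.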

Minimum Frequency

Minimum corpus frequency below which a word in the dictionary is not considered. Set it to 0 if filtering according to minimum frequency is not needed.
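
A minimal sketch of this kind of filtering, using only the Python standard library. The function name `filter_vocabulary` is hypothetical, and whitespace tokenization is assumed only for the example.

```python
from collections import Counter


def filter_vocabulary(tokens, min_frequency):
    """Keep only words that occur at least `min_frequency` times."""
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n >= min_frequency}


tokens = "to be or not to be".split()
print(filter_vocabulary(tokens, 2))  # {'to': 2, 'be': 2}
print(filter_vocabulary(tokens, 0))  # no filtering: all four words are kept
```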

Training parameters


Number of epochs for model training. Training time grows linearly with the number of epochs.

Batch size

The batch size used to train the Word2Vec model.

Adam learning rate

Set the learning rate for the Adam optimizer. The actual step in the parameter space is dynamic during training.
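
To illustrate why the step is dynamic, here is a sketch of the standard Adam update for a single scalar parameter. The values of `beta1`, `beta2` and `eps` are the commonly used defaults and are an assumption here; the node's internal settings are not stated on this page.

```python
import math


def adam_step(param, grad, m, v, t, learning_rate=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    param -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
for t, grad in enumerate([0.5, 0.1, 0.02], start=1):
    param, m, v = adam_step(param, grad, m, v, t, learning_rate=0.01)
    print(t, param)
```

The learning rate scales the update, but the actual step also depends on the running gradient statistics, so it changes from one iteration to the next.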

Input Ports


A KNIME table with a string column to use for Word2Vec training.

Output Ports


A KNIME table with three columns: the index of the token (word), the token itself, and the embedding for the token as a collection (KNIME native list).



This node has no views

