Dictionary Replacer

Replaces complete terms in the input documents that match specified dictionary terms with the corresponding replacement value. The dictionary is provided as an additional data table at the second input port and must contain at least two string columns: one holding the strings to replace (keys) and one holding the replacement strings (values). Both columns are selected in the dialog.
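
The replacement behaviour described above can be sketched in plain Python. This is only an illustration, not the node's implementation: the function name `replace_terms`, the whitespace-based tokenization, and the sample dictionary are assumptions made for the example.

```python
def replace_terms(text, dictionary):
    """Replace each whole term that exactly matches a dictionary key."""
    # Assumption: terms are separated by whitespace. The node itself uses
    # the word tokenizer selected in its dialog.
    terms = text.split()
    # Terms without a matching key are kept unchanged.
    return " ".join(dictionary.get(term, term) for term in terms)


# Hypothetical dictionary: keys are the strings to replace,
# values are the replacement strings.
sample_dictionary = {"NYC": "New York City", "UK": "United Kingdom"}

result = replace_terms("Flights from NYC to the UK", sample_dictionary)
```

Because only complete terms are compared, a term such as `UKs` would not match the key `UK` in this sketch and would be left as it is.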


Preprocessing options

Document column
The column containing the documents to preprocess.
Replace documents
If checked, the documents will be replaced by the new preprocessed documents. Otherwise, the preprocessed documents will be appended as a new column.
Append column
The name of the appended column that contains the preprocessed documents.
Ignore unmodifiable tag
If checked, terms marked as unmodifiable will be preprocessed as well.


Column containing the strings to replace
The column containing the strings (words/terms) to replace (keys).
Column containing the replacement strings
The column containing the replacement strings (values).
Replace words not in the dictionary by
If checked, all words that are not contained in the dictionary are replaced by the string entered in the text field.
Note: Entering an empty string or a string consisting solely of whitespace removes all terms not contained in the dictionary.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
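
The effect of the "Replace words not in the dictionary by" option can be sketched as follows. This is a hedged illustration under stated assumptions, not the node's actual code: the function name `replace_with_default`, the `default` parameter (with `None` standing for the unchecked option), and the whitespace tokenization are all hypothetical.

```python
def replace_with_default(text, dictionary, default=None):
    """Replace dictionary terms; optionally replace or drop all other terms."""
    replaced = []
    # Assumption: whitespace tokenization stands in for the selected tokenizer.
    for term in text.split():
        if term in dictionary:
            # Term is a dictionary key: use its replacement value.
            replaced.append(dictionary[term])
        elif default is None:
            # Option unchecked: keep terms that are not in the dictionary.
            replaced.append(term)
        elif default.strip():
            # Option checked with a non-blank string: substitute it.
            replaced.append(default)
        # Option checked with an empty or whitespace-only string:
        # the term is dropped.
    return " ".join(replaced)


sample_dictionary = {"cat": "animal"}

kept = replace_with_default("the cat sat", sample_dictionary)
substituted = replace_with_default("the cat sat", sample_dictionary, "OTHER")
removed = replace_with_default("the cat sat", sample_dictionary, "  ")
```

In this sketch, `kept` retains the unmatched words, `substituted` swaps each of them for `OTHER`, and `removed` contains only the replaced dictionary term.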

Input Ports

The input table which contains the documents to preprocess.
The input table containing the dictionary, with at least two string columns.

Output Ports

The output table which contains the preprocessed documents.


This node has no views



