Dictionary Replacer (File-based)

Replaces terms in the input documents that match specified dictionary terms with the corresponding values. A dictionary file must be specified. Each line of the dictionary file must contain a key and a value, separated by a comma (","). If a key matches a term, the term is replaced by the value.
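The file format and matching rule described above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function names `load_dictionary` and `replace_terms` are hypothetical, and the sketch assumes that each line is split at its first comma, that lines without a comma are skipped, that matching is exact and case-sensitive, and that terms are separated by whitespace.

```python
import io


def load_dictionary(lines):
    """Parse 'key,value' lines into a dict (assumption: split at the first comma)."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition(",")
        if not sep:
            continue  # assumption: lines without a comma are ignored
        mapping[key] = value
    return mapping


def replace_terms(terms, mapping):
    """Replace each term that equals a dictionary key with the key's value."""
    return [mapping.get(term, term) for term in terms]


# A two-line dictionary file, simulated in memory.
sample_file = io.StringIO("NYC,New York City\nUK,United Kingdom\n")
dictionary = load_dictionary(sample_file)

terms = "She moved from NYC to the UK".split()
print(replace_terms(terms, dictionary))
```

Terms without a matching key ("She", "moved", and so on) pass through unchanged in this sketch, which corresponds to leaving the "Replace words not in the dictionary by" option unchecked.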

Options

Preprocessing options

Document column
The column containing the documents to preprocess.
Replace documents
If checked, the documents are replaced by the preprocessed documents. Otherwise, the preprocessed documents are appended as a new column.
Append column
The name of the appended column that contains the preprocessed documents.
Ignore unmodifiable tag
If checked, unmodifiable terms will be preprocessed too.

Dictionary options

Dictionary file
The path of the dictionary file to use.
Replace words not in the dictionary by
If checked, all words that are not contained in the dictionary are replaced by the string entered in the text field.
Note: Entering an empty string or a string consisting solely of whitespace removes all terms that are not contained in the dictionary.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
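The effect of the "Replace words not in the dictionary by" option can be sketched in Python as follows. This is an illustration only, not the node's code: `replace_with_default` is a hypothetical helper, `default=None` stands in for the unchecked option, and exact, case-sensitive matching on already tokenized terms is assumed.

```python
def replace_with_default(terms, mapping, default=None):
    """Replace dictionary terms; handle all other terms according to `default`.

    default=None       -> option unchecked: other terms are kept as they are
    default blank      -> empty or whitespace-only: other terms are removed
    default otherwise  -> other terms are replaced by the given string
    """
    result = []
    for term in terms:
        if term in mapping:
            result.append(mapping[term])
        elif default is None:
            result.append(term)
        elif default.strip() == "":
            continue  # drop the term entirely
        else:
            result.append(default)
    return result


mapping = {"NYC": "New York City"}  # hypothetical dictionary contents
terms = ["I", "love", "NYC"]

print(replace_with_default(terms, mapping))             # option unchecked
print(replace_with_default(terms, mapping, "UNKNOWN"))  # fixed replacement string
print(replace_with_default(terms, mapping, "   "))      # whitespace-only string
```

The third call shows the case described in the note: with a whitespace-only string, only the terms found in the dictionary remain in the document.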

Input Ports

The input table which contains the documents to preprocess.

Output Ports

The output table which contains the preprocessed documents.

Views

This node has no views
