Punctuation Erasure

This Node Is Deprecated: This version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the version linked below.
Suggested replacement: Punctuation Erasure

This node removes punctuation characters from terms. The preprocessed terms are stored in the outgoing DataTable, together with the documents containing these terms.
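
The effect on individual terms can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name `erase_punctuation` is hypothetical, and the sketch assumes that "punctuation" means the ASCII characters in Python's `string.punctuation`, which may differ from the character set the node actually uses.

```python
import string

# Assumption: punctuation is the ASCII set in string.punctuation;
# the node's actual character set may differ.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def erase_punctuation(term: str) -> str:
    """Return the term with all ASCII punctuation characters removed."""
    return term.translate(_PUNCT_TABLE)


terms = ["U.S.A.", "well-known", "hello!", "data"]
cleaned = [erase_punctuation(t) for t in terms]
print(cleaned)  # ['USA', 'wellknown', 'hello', 'data']
```

Note that a term consisting only of punctuation becomes an empty string in this sketch; how the node treats such terms is not stated here.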

Options

Deep preprocessing options

Deep preprocessing
If deep preprocessing is checked, the terms contained inside the documents are preprocessed as well. The documents themselves are therefore changed, which is more time-consuming.
Document column
Specifies the column containing the documents to preprocess.
Append unchanged documents
If checked, the documents contained in the specified "Original Document column" are appended unchanged even if deep preprocessing is checked. This keeps the original documents in the output data table without the need for a separate join.
Original Document column
Specifies the column containing the original documents which can be attached unchanged.
Ignore unmodifiable tag
If checked, terms tagged as unmodifiable are preprocessed too.
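
How these options interact can be sketched in Python under some loudly stated assumptions. The data model (`Term`, `Document`), the function names, and the default values are all hypothetical and chosen for illustration; the sketch also simplifies by treating the document column and the "Original Document column" as the same column. It is not the node's actual implementation.

```python
import string
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Assumption: punctuation is the ASCII set in string.punctuation.
PUNCT = str.maketrans("", "", string.punctuation)


@dataclass(frozen=True)
class Term:  # hypothetical model of a term
    text: str
    unmodifiable: bool = False


@dataclass(frozen=True)
class Document:  # hypothetical model of a document as a sequence of terms
    terms: Tuple[Term, ...]


def clean_term(term: Term, ignore_unmodifiable: bool) -> Term:
    # Unmodifiable terms are skipped unless the tag is ignored.
    if term.unmodifiable and not ignore_unmodifiable:
        return term
    return replace(term, text=term.text.translate(PUNCT))


def preprocess_row(
    term: Term,
    doc: Document,
    *,
    deep: bool = True,  # defaults here are arbitrary, not the node's
    append_unchanged: bool = False,
    ignore_unmodifiable: bool = False,
) -> Tuple[Term, Document, Optional[Document]]:
    """Process one bag-of-words row: (term, document)."""
    new_term = clean_term(term, ignore_unmodifiable)
    new_doc = doc
    if deep:
        # Deep preprocessing also rewrites the terms inside the document.
        new_doc = Document(
            tuple(clean_term(t, ignore_unmodifiable) for t in doc.terms)
        )
    # Optionally carry the original document along unchanged.
    original = doc if append_unchanged else None
    return new_term, new_doc, original


doc = Document((Term("Dr."), Term("Smith,"), Term("e.g.", unmodifiable=True)))
term, new_doc, original = preprocess_row(
    Term("Smith,"), doc, deep=True, append_unchanged=True
)
print(term.text)                        # Smith
print([t.text for t in new_doc.terms])  # ['Dr', 'Smith', 'e.g.']
print(original is doc)                  # True
```

With `deep=False`, only the term cell changes and the document is passed through untouched, which is the cheaper path described above. With `ignore_unmodifiable=True`, the tagged term "e.g." would be reduced to "eg" as well.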

Input Ports

The input table which contains the terms and the documents as a bag of words.

Output Ports

The output table which contains the preprocessed terms and the documents as a bag of words.

Views

This node has no views.
