Spacy Tokenizer

This node is deprecated. It has been replaced by a new and improved version. The old version is kept for backwards compatibility, but we recommend the new version for all new workflows.

The node converts a String column containing raw text into the KNIME Document format, using a tokenizer based on the selected spaCy model.

Options

Select column: The String column to tokenize and convert to the Document format.
Replace column: If checked, the selected column is replaced by the new preprocessed documents. Otherwise, the preprocessed documents are appended as a new column.
Append column: The name of the newly appended column that contains the preprocessed documents.
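
The interplay of the Replace column and Append column options can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the node's implementation: rows are modelled as plain dicts, the function `apply_tokenizer` and the default column name `Preprocessed Document` are hypothetical, and `str.split` stands in for the spaCy tokenizer.

```python
def apply_tokenizer(rows, column, tokenize, replace=True,
                    new_column="Preprocessed Document"):
    """Return new rows with `column` tokenized.

    replace=True  -> the tokenized value overwrites the selected column.
    replace=False -> the tokenized value is stored under `new_column`
                     (hypothetical default name) and the input is kept.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input table is left untouched
        tokens = tokenize(row[column])
        if replace:
            new_row[column] = tokens
        else:
            new_row[new_column] = tokens
        result.append(new_row)
    return result


rows = [{"text": "Hello world"}]

# str.split is only a stand-in for the spaCy tokenizer used by the node.
replaced = apply_tokenizer(rows, "text", str.split, replace=True)
appended = apply_tokenizer(rows, "text", str.split, replace=False,
                           new_column="tokens")
```

With replacement, the `text` column holds the tokens. With appending, `text` keeps the raw string and the tokens land in the new column.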

Model

spaCy model: Pick one of the official spaCy models, or point to a custom model stored in the file system. For a custom model, select the folder that contains meta.json and the config files.
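
Before pointing the node at a custom model, it may help to check that the folder has the expected layout. The following is a minimal sketch using only the Python standard library. It assumes the spaCy v3 convention in which the config file is named `config.cfg`. The helper name `looks_like_spacy_model_dir` is hypothetical, and the check is not necessarily the one the node performs.

```python
import json
from pathlib import Path


def looks_like_spacy_model_dir(path):
    """Rough check that `path` resembles a saved spaCy pipeline folder."""
    folder = Path(path)
    meta = folder / "meta.json"
    config = folder / "config.cfg"  # assumption: spaCy v3 file name

    # Both files must exist as regular files.
    if not (meta.is_file() and config.is_file()):
        return False

    # meta.json must at least be readable, valid JSON.
    try:
        json.loads(meta.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return True
```

A call such as `looks_like_spacy_model_dir("/path/to/model")` returns `False` when either file is missing or `meta.json` is not valid JSON. It does not verify that the pipeline actually loads.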

Python

Select one of the Python execution environment options:

use default Python environment for the Redfield NLP nodes
use Conda environment

Input Ports

The input table, which contains the documents to preprocess.

Output Ports

The output table, which contains the preprocessed documents.

Views

This node has no views.

Installation

To use this node in KNIME, install the Redfield NLP Nodes extension from the update site below, following the NodePit Product and Node Installation Guide:

v5.5

Plugin provider: Redfield AB

Plugin version: 1.2.0.202506120206

On NodePit since: 2025-07-02

Last update: 2025-07-17

Tags: Deprecated

KNIME versions: v5.5, v5.4, v5.3, v5.1, v4.7, v4.6, v4.5, v4.4, v4.3
