OpenNLP NE Tagger

This node recognizes named entities based on OpenNLP Name Finder models and assigns the corresponding tags to them. The version of the underlying OpenNLP framework is 1.8.4. The built-in models are pre-trained models from OpenNLP version 1.5.
Documents can also be tagged with models other than the built-in ones. To do so, read the model with the OpenNLP NER Model Reader node and connect it to the second input port.
Note: Models trained with OpenNLP versions older than 1.5 or newer than 1.8.4 might not work correctly.
The following entities are recognized: Persons, Locations, Organizations, Money, Date, and Time.
For more details on the OpenNLP natural language processing toolkit, see the Apache OpenNLP website.

Options

General options

Document column
The column containing the documents to tag.
Replace column
If checked, the documents in the selected document column will be replaced by the newly tagged documents. Otherwise, the tagged documents will be appended as a new column.
Append column
The name of the newly appended column containing the tagged documents.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
Number of maximal parallel tagging processes
Defines the maximum number of parallel threads used for tagging. Note that a separate tagging model is loaded into memory for each thread. If this value is greater than 1, make sure that enough heap space is available to load all the models. If you are not sure how much heap is available to KNIME, leave the value at 1.
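
The heap trade-off described above can be estimated with a little arithmetic. The following Java sketch is illustrative only: `TaggerThreadBudget` and `maxParallelTaggers` are hypothetical names that are not part of KNIME or OpenNLP, and the 64 MiB model size and the 50% heap share are assumed placeholder values, not measurements. It divides the share of the heap reserved for models by the assumed size of one loaded model. The heap available to KNIME is normally governed by the `-Xmx` entry in `knime.ini`.

```java
// Hypothetical helper for illustration; not part of KNIME or OpenNLP.
final class TaggerThreadBudget {

    /**
     * Estimates how many tagging threads fit into the heap when every
     * thread holds its own copy of the model.
     *
     * @param maxHeapBytes   heap available to the JVM, in bytes
     * @param modelBytes     assumed in-memory size of one loaded model, in bytes
     * @param usableFraction share of the heap that may be spent on models, in (0, 1]
     * @return a thread count of at least 1
     */
    static int maxParallelTaggers(long maxHeapBytes, long modelBytes, double usableFraction) {
        if (maxHeapBytes <= 0 || modelBytes <= 0) {
            throw new IllegalArgumentException("heap and model size must be positive");
        }
        if (usableFraction <= 0.0 || usableFraction > 1.0) {
            throw new IllegalArgumentException("usableFraction must be in (0, 1]");
        }
        // Heap budget reserved for models, divided by the size of one model.
        long budget = (long) (maxHeapBytes * usableFraction);
        long threads = budget / modelBytes;
        // Never suggest fewer than one thread; clamp to the int range.
        return (int) Math.max(1L, Math.min(threads, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long assumedModelBytes = 64L * 1024 * 1024; // assumption: 64 MiB per loaded model
        int threads = maxParallelTaggers(maxHeap, assumedModelBytes, 0.5);
        System.out.println("Suggested maximum number of parallel tagging processes: " + threads);
    }
}
```

The estimate is deliberately conservative: it ignores the memory needed for the documents themselves, so the real limit may be lower. When in doubt, the default of 1 remains the safe choice.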

Tagger options

Set named entities unmodifiable
If checked, recognized named-entity terms are marked as unmodifiable.
Built-in OpenNLP model
Specifies the built-in OpenNLP model to use. Each model recognizes one specific type of named entity.
Use model from input port
If checked, the model from the second input port will be used for tagging.
Tag value
Specifies the named-entity tag to be used for tagging with the model from the input port.

Input Ports

The input table containing the documents to tag.
The port object containing the OpenNLP model.

Output Ports

An output table containing the tagged documents.

Views

This node has no views
