OpenNLP NE tagger

This node is deprecated. It is kept for backwards compatibility, but using it in new workflows is no longer recommended. The documentation below might contain more information.

This node recognizes named entities based on OpenNLP models version 1.5.2 and assigns the corresponding tags to them. The following entities are recognized: Persons, Locations, Organizations, Money, Date, and Time. For more details on the OpenNLP natural language processing toolkit, see http://opennlp.apache.org/documentation.html.

Options

Tagger options

Set named entities unmodifiable
Marks recognized named entity terms as unmodifiable.
OpenNlp model
Specifies the OpenNLP model to use. Each model recognizes one type of named entity.
Use additional dictionary file
If checked, the OpenNLP model uses an additional dictionary file to recognize entities.
Dictionary file
The location of the dictionary file. Each named entity must be written on its own line, and the words of the entity must be separated by whitespace, e.g.:
Firstname1 Lastname1
Firstname2 Lastname2
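
The dictionary format described above (one entity per line, words separated by whitespace) can be checked before pointing the node at a file. The following is a minimal sketch in Python, not part of KNIME or OpenNLP; the function name `parse_dictionary` and the sample names are hypothetical and used for illustration only.

```python
def parse_dictionary(text):
    """Parse dictionary text into a list of entities.

    Each non-blank line is one entity; its words are separated by
    whitespace. Returns a list of word lists.
    (Hypothetical helper, not a KNIME or OpenNLP function.)
    """
    entities = []
    for line in text.splitlines():
        words = line.split()  # split on any run of whitespace
        if words:             # ignore blank lines
            entities.append(words)
    return entities


# Illustrative sample content with two person names.
sample = "Ada Lovelace\nAlan Turing\n"
entries = parse_dictionary(sample)
print(entries)
```

Splitting on any run of whitespace keeps the check tolerant of tabs or repeated spaces between words; whether the node itself is equally tolerant is not stated in this description.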

General options

Number of maximal parallel tagging processes
Defines the maximum number of parallel threads used for tagging. Note that a tagging model is loaded into memory for each thread. If this value is set to a number greater than 1, make sure that enough heap space is available to load the models. If you are not sure how much heap is available for KNIME, leave the value at 1.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
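
Because one tagging model is loaded per thread, the setting for the number of parallel tagging processes is bounded by how many model copies fit into the heap. The sketch below shows that arithmetic in Python. The function `max_tagging_threads`, its parameters, and all numbers are assumptions for illustration; the actual memory footprint of a model is not given in this description and has to be measured.

```python
def max_tagging_threads(available_heap_mb, model_size_mb, reserve_mb=0):
    """Estimate how many model copies fit into the usable heap.

    available_heap_mb: heap available to KNIME (assumed known)
    model_size_mb:     in-memory size of one loaded model (assumed, measure it)
    reserve_mb:        heap kept free for the rest of the workflow (assumed)
    Always returns at least 1, matching the node's safe default.
    """
    if model_size_mb <= 0:
        raise ValueError("model_size_mb must be positive")
    usable = available_heap_mb - reserve_mb
    return max(1, usable // model_size_mb)


# Hypothetical numbers: 2048 MB heap, 300 MB per model, 512 MB reserved.
threads = max_tagging_threads(2048, 300, reserve_mb=512)
print(threads)  # 5
```

If the estimate is uncertain, the value 1 remains the conservative choice.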

Input Ports

The input table containing the documents to tag.

Output Ports

An output table containing the tagged documents.

Views

This node has no views
