Abner Tagger

This node recognizes biomedical named entities, such as genes, proteins, and cells, and assigns one of the tags "PROTEIN", "RNA", "DNA", "CELL LINE", or "CELL TYPE" to the corresponding terms. Recognized named entities can optionally be marked as unmodifiable, meaning that subsequent nodes will not modify them. The underlying named entity recognition software is ABNER (A Biomedical Named Entity Recognizer). For more details see: http://pages.cs.wisc.edu/~bsettles/abner/.

Options

General options

Document column
The column containing the documents to tag.
Replace column
If checked, the documents in the selected document column are replaced by the newly tagged documents. Otherwise the tagged documents are appended as a new column.
Append column
The name of the new appended column, containing the tagged documents.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.

Tagger Options

Set named entities unmodifiable
If checked, recognized named entity terms are marked as unmodifiable, so that subsequent nodes do not modify them.
ABNER model
Specifies the ABNER model to use. The BioCreative model recognizes proteins only; the NLPBA model also recognizes DNA, RNA, cell lines, and cell types.
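
The effect of the two tagger options can be illustrated with a small Python sketch. This is not ABNER or KNIME code: the toy lexicon, the Term class, and the functions tag_terms and lowercase_terms are hypothetical names introduced only for illustration, and a simple dictionary lookup stands in for ABNER's statistical model. The sketch assumes that a later preprocessing step skips terms marked as unmodifiable, as described above.

```python
from dataclasses import dataclass
from typing import Optional

# Tag sets per model, following the option description above.
MODEL_TAGS = {
    "BioCreative": {"PROTEIN"},
    "NLPBA": {"PROTEIN", "DNA", "RNA", "CELL LINE", "CELL TYPE"},
}

# Hypothetical toy lexicon; a stand-in for ABNER's trained model.
TOY_LEXICON = {
    "p53": "PROTEIN",
    "mRNA": "RNA",
    "HeLa": "CELL LINE",
}


@dataclass
class Term:
    text: str
    tag: Optional[str] = None
    unmodifiable: bool = False


def tag_terms(words, model="NLPBA", set_unmodifiable=True):
    """Tag each word if the chosen model supports its entity type."""
    allowed = MODEL_TAGS[model]
    terms = []
    for word in words:
        tag = TOY_LEXICON.get(word)
        if tag in allowed:
            terms.append(Term(word, tag, set_unmodifiable))
        else:
            terms.append(Term(word))
    return terms


def lowercase_terms(terms):
    """Stand-in for a later preprocessing node; leaves unmodifiable terms alone."""
    return [
        t if t.unmodifiable else Term(t.text.lower(), t.tag, False)
        for t in terms
    ]
```

In this sketch, tagging the words "HeLa" and "p53" with the "BioCreative" model tags only "p53", while the "NLPBA" model tags both. With the unmodifiable flag set, the later lowercase step leaves "HeLa" unchanged.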

Input Ports

The input table containing the documents to tag.

Output Ports

An output table containing the tagged documents.

Views

This node has no views.
