This node assigns a part-of-speech (POS) tag to each term of a
document. It is applicable to French, English, and German texts. The
underlying tagger models are those of the Stanford NLP group.
For English texts the Penn Treebank tag set is used.
For German texts the STTS tag set is used.
For French texts the French Treebank tag set is used: http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php.
Note: The provided tagger models vary in memory consumption and processing speed. The English bidirectional, German hgc, and German dewac models in particular require a lot of memory. To use these models, it is recommended to run KNIME with at least 2GB of heap space. To increase the heap space, change the -Xmx setting in the knime.ini file. If KNIME is running with less than 1.5GB of heap space, it is recommended to use the English left3words, English left3words caseless, or German fast models for tagging English or German texts.
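As a rough illustration of the heap setting mentioned above, the relevant part of knime.ini might look like the fragment below. This is a sketch, not a complete file: the other entries in knime.ini differ between installations and should be left as they are, and the value 2g is simply the 2GB minimum recommended above for the larger models. The -Xmx option is a JVM argument, so it is assumed here to sit on its own line after the -vmargs entry.

```
-vmargs
-Xmx2g
```

KNIME has to be restarted for a changed heap size to take effect.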
Descriptions of the models (taken from the website of the Stanford NLP group):