Extracts relation triplets from the sentences of a document by examining relations between tagged named entities.
The node can be used in two ways, depending on whether the Apply preprocessing option is checked.
If the option is selected, the node takes care of part-of-speech (POS) and named-entity (NE) tagging as well as lemmatizing.
Stanford CoreNLP standard settings are used in this case. However, tags are not applied to the documents,
since the preprocessing is only applied internally. If the option is unchecked, it is necessary
to provide a column with tagged documents (POS and NE) as well as a column containing lemmatized documents.
Lemmatized documents consist of terms that were converted to their canonical, dictionary or citation form.
Note: Creating the same pipeline by using KNIME's Stanford nodes with default settings
will not necessarily lead to the same results as using the Apply preprocessing option,
since KNIME is using the Penn-Treebank (PTB) tag set. This tag set uses the SYM tag for
any kind of punctuation and quotation marks. However, Stanford CoreNLP uses a modified version of the
PTB tag set to distinguish these symbols, since they are important for dependency parsing and
natural logic annotation.
The node creates four new columns: two object columns containing named-entities, one column containing the type of relation with the highest confidence and
a column containing the confidence for this relation. The node handles classic named-entities like PERSON, LOCATION and ORGANIZATION.
Relation types that can be extracted are Live_In, OrgBased_In, Located_In, Work_For and _NR.
_NR specifies no relation between two entities. A detailed explanation of StanfordNLP's approach for relation extraction can be found in this article.
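The output layout described above (two entities, the most confident relation type, and its confidence) can be illustrated with a minimal Python sketch. This is not the node's or Stanford CoreNLP's implementation; the function names `best_relation` and `to_row`, the entity pair, and the scores are hypothetical and only show how one row could be derived from per-relation confidences.

```python
# Relation labels named in the node description; "_NR" means "no relation".
RELATION_TYPES = ("Live_In", "OrgBased_In", "Located_In", "Work_For", "_NR")


def best_relation(scores):
    """Return (label, confidence) for the highest-scoring relation type."""
    unknown = set(scores) - set(RELATION_TYPES)
    if unknown:
        raise ValueError(f"unexpected relation labels: {sorted(unknown)}")
    label = max(scores, key=scores.get)
    return label, scores[label]


def to_row(entity1, entity2, scores):
    """Build one output row: two entities, relation type, confidence."""
    label, confidence = best_relation(scores)
    return (entity1, entity2, label, confidence)


# Hypothetical confidences for the entity pair ("Alice", "Berlin").
example_scores = {
    "Live_In": 0.71,
    "Located_In": 0.12,
    "Work_For": 0.05,
    "OrgBased_In": 0.02,
    "_NR": 0.10,
}
row = to_row("Alice", "Berlin", example_scores)
print(row)
```

If `_NR` had the highest score, the row would report `_NR`, meaning that no relation was found between the two entities.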
Note: Relation Extraction is a computationally expensive operation. When using this
node, it is recommended to run KNIME with at least 4GB of heap space. To increase the heap space, change
the -Xmx setting in the knime.ini file.
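As a rough illustration of the heap-space change, the following Python sketch rewrites the -Xmx line in the text of a knime.ini file. The helper name `set_max_heap` and the sample file contents are hypothetical, and the value `4g` is just one way to write the recommended 4GB. Editing the file by hand in a text editor works equally well.

```python
import re


def set_max_heap(ini_text, size="4g"):
    """Return ini_text with its -Xmx line set to the given size.

    If no -Xmx line exists, one is appended after the existing lines.
    """
    new_line = f"-Xmx{size}"
    lines = ini_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match lines such as -Xmx2048m or -Xmx2g.
        if re.fullmatch(r"-Xmx\d+[kKmMgG]?", line.strip()):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical excerpt of a knime.ini file, used only for illustration.
sample = "-vmargs\n-Xmx2048m\n-Dfile.encoding=UTF-8\n"
updated = set_max_heap(sample)
print(updated)
```

KNIME reads knime.ini at startup, so a restart is needed before the new setting takes effect.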
This node is based on Stanford CoreNLP 3.9.1.
For more information about StanfordNLP and Relation Extraction, click here.
To use this node in KNIME, install the KNIME Textprocessing extension.