Document Data Assigner

KNIME Textprocessing Plug-in version 4.0.0.v201906171531 by KNIME AG, Zurich, Switzerland

The Document Data Assigner adds meta information such as authors, source, category, type and publication date to input documents. Which meta information is assigned, and from which columns, can be specified in the node dialog. The new document column can either be appended or replace the old one. This node is streamable.

Options

Use author(s) from column
If checked, the string values of the specified column will be used as author(s).
Author name separator
The string that separates the author names in the authors column. The authors column has to be formatted like "John DoeSEPARATORJennifer Doe". First, the authors are separated at the defined character(s); then each name is split at the whitespace between first name and last name. Note: only the first and the last name are assigned to the document, so middle names are dropped (e.g. 'John Franklin Doe' is handled as 'John Doe').
Document source
The source which is set to all documents (if "Use sources from column" is unchecked).
Use sources from column
If checked, the string values of the specified column will be used as document sources.
Document source column
The column containing the string used as source. No source is set for missing values.
Document category
The category which is set to all documents (if "Use categories from column" is unchecked).
Use categories from column
If checked, the string values of the specified column will be used as document categories.
Document category column
The column containing the string used as category. No category is set for missing values.
Document type
The type which is set to all documents.
Publication date
The publication date which is set to all documents (if "Use publication date from column" is not checked).
Use publication date from column
If checked, the Date value of the specified column will be used as document publication date.
Publication date column
The date column containing the publication date. It is used only if "Use publication date from column" is checked; otherwise the date currently set in the "Date" field is used as the publication date.
Replace document column
If checked, the incoming document column will be replaced by the processed document column. Otherwise, the new document column will be appended.
Append document column
If checked, a new document column containing the processed documents will be appended to the existing table.
Appended document column name
The name of the appended document column.
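
The splitting rule described under "Author name separator" can be illustrated with a short Python sketch. This is not the node's actual implementation (the plug-in itself is written in Java); the function name split_authors is hypothetical, and the handling of empty entries and single-word names is an assumption, since the description above does not cover those cases.

```python
from typing import List, Tuple


def split_authors(value: str, separator: str) -> List[Tuple[str, str]]:
    """Split an authors string into (first_name, last_name) pairs.

    Hypothetical helper mirroring the documented rule: separate the
    authors at the separator, then split each name at whitespace and
    keep only the first and the last token.
    """
    authors = []
    for raw_name in value.split(separator):
        parts = raw_name.split()  # split at any run of whitespace
        if not parts:
            continue  # assumption: empty entries are skipped
        first = parts[0]
        # assumption: a single-word name yields an empty last name
        last = parts[-1] if len(parts) > 1 else ""
        authors.append((first, last))
    return authors


print(split_authors("John Franklin Doe, Jennifer Doe", ", "))
# prints [('John', 'Doe'), ('Jennifer', 'Doe')]
```

The example shows why the separator should not occur inside a name: the middle name 'Franklin' is dropped because only the first and last whitespace-separated tokens are kept.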

Input Ports

The input table which contains documents to process and optionally string columns containing the meta information.

Output Ports

The output table which contains the processed document column.

Installation

To use this node in KNIME, install the KNIME Textprocessing Plug-in from the following update site:

KNIME 4.0