Meta Info Extractor

This node is deprecated. This version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the replacement named below.
Suggested replacement: Meta Info Extractor

Extracts the meta information key-value pairs of documents. The dialog controls whether the documents are appended to the output and whether duplicate documents in the input table are ignored, so that meta information is extracted only for distinct documents. Keys can also be specified to restrict which key-value pairs are extracted. Since a document may have several key-value pairs, the original row IDs cannot be kept. The output table contains one row for each key-value pair of a document. It has at least two columns, one with the keys and one with the values. A third column containing the documents themselves is appended if specified in the dialog. Documents that are missing, that do not contain any of the selected keys, or that have no keys at all are omitted.

Options

Document column
The document column to use.
Append documents
If checked, a column containing the documents is appended.
For distinct documents only
If checked, duplicate documents are ignored.
Extract only meta info for specified keys
If checked, only meta information for the specified keys is extracted. Otherwise, all key-value pairs are extracted.
Meta info keys (comma separated)
The keys of the meta information to extract. Multiple keys must be separated by commas.
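
The behaviour described above can be illustrated with a minimal Python sketch. It is not the node's implementation: the `Document` type, the `parse_keys` and `extract_meta_info` functions, and their parameter names are hypothetical stand-ins for the dialog options, and trimming whitespace around comma-separated keys is an assumption rather than documented behaviour.

```python
from typing import NamedTuple, Optional


class Document(NamedTuple):
    """Hypothetical stand-in for a document: its text plus (key, value) meta pairs."""
    text: str
    meta: tuple  # e.g. (("author", "Ann"), ("year", "2010"))


def parse_keys(spec: str) -> set:
    # Split the comma-separated key list; trimming whitespace is an assumption.
    return {k.strip() for k in spec.split(",") if k.strip()}


def extract_meta_info(documents, key_spec: Optional[str] = None,
                      append_documents: bool = False,
                      distinct_only: bool = False) -> list:
    """Return one output row per (key, value) pair of each document."""
    wanted = parse_keys(key_spec) if key_spec is not None else None
    seen = set()
    rows = []
    for doc in documents:
        if doc is None:
            continue  # missing documents are omitted
        if distinct_only:
            if doc in seen:
                continue  # duplicate documents are ignored
            seen.add(doc)
        for key, value in doc.meta:
            if wanted is not None and key not in wanted:
                continue  # keep only the specified keys
            # The document column, if requested, comes third.
            rows.append((key, value, doc) if append_documents else (key, value))
    return rows
```

Because each document can contribute zero, one, or many rows, the output has no one-to-one relation to the input rows, which is why the original row IDs cannot be carried over.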

Input Ports

The input table which contains the documents.

Output Ports

The output table, which contains the extracted meta information and, if selected in the dialog, the documents.

Views

This node has no views
