
Document Grabber

KNIME Textprocessing Plug-in version 4.0.0.v201908091514 by KNIME AG, Zurich, Switzerland

Downloads documents from a database selected in the dialog, e.g. PubMed. The node sends the specified query to the database, downloads the resulting documents, and parses them. If the corresponding option is set in the dialog, the downloaded files are deleted after parsing.
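As a rough illustration of what a PubMed query looks like outside KNIME, the sketch below builds a search URL for NCBI's public E-utilities `esearch` endpoint. This is an assumption made for illustration only: this page does not say how the node talks to PubMed, and the helper name `build_pubmed_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed here for illustration;
# the node's own implementation is not documented on this page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(query: str, max_results: int) -> str:
    """Hypothetical helper: build an esearch URL for a PubMed query.

    `query` plays the role of the node's "Query" option and
    `max_results` the role of "Maximal results".
    """
    if max_results < 0:
        raise ValueError("max_results must not be negative")
    params = {
        "db": "pubmed",          # database to search
        "term": query,           # the search query
        "retmax": max_results,   # upper bound on returned IDs
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pubmed_search_url("breast cancer", 10))
```

The XML returned by `esearch` includes a total hit count alongside the list of matching IDs, which is roughly the information the "Number of results" option displays.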

Options

Query
The query that is sent to the specified database.
Number of results
Clicking the button shows the number of results for the specified query.
Maximal results
The maximum number of documents to download and parse.
Append query column
If checked, a string column containing the specified query string is appended.
Database
The database to send the query to and receive the resulting documents from, e.g. PubMed.
Extract meta information if provided by database
If checked, meta information is extracted if the database provides it. For PubMed, the meta information consists of the PubMed ID, the chemical list, and the MeSH heading list assigned to the article. It is stored as a regular section in each document, annotated as a meta information section.
Documents directory
The directory to save the documents to. The specified directory must exist, be writable, and be empty.
Delete after parsing
If checked, the files containing the documents will be deleted after parsing.
Document category
The category of the documents.
Document type
The type of the documents.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
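The "Documents directory" option requires a directory that exists, is writable, and is empty. A minimal sketch of checking those three conditions before configuring the node could look like the following; the function name `check_documents_directory` is hypothetical and is not part of KNIME.

```python
import os
import tempfile


def check_documents_directory(path: str) -> None:
    """Hypothetical pre-check mirroring the three stated requirements."""
    # Requirement 1: the directory must exist.
    if not os.path.isdir(path):
        raise NotADirectoryError(f"not an existing directory: {path}")
    # Requirement 2: the directory must be writable.
    if not os.access(path, os.W_OK):
        raise PermissionError(f"directory is not writable: {path}")
    # Requirement 3: the directory must be empty.
    if os.listdir(path):
        raise ValueError(f"directory is not empty: {path}")


if __name__ == "__main__":
    # A freshly created temporary directory satisfies all three conditions.
    with tempfile.TemporaryDirectory() as tmp:
        check_documents_directory(tmp)
        print("directory is usable:", tmp)
```

Running a check like this first avoids discovering a leftover file in the target directory only when the node is executed.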

Output Ports

An output table which contains the parsed document data.

Installation

To use this node in KNIME, install the KNIME Textprocessing Plug-in from the following update site:

KNIME 4.0