Document Grabber

Downloads documents from a database that can be specified in the dialog, e.g. PubMed. The specified query is sent to the database, and the resulting documents are downloaded and parsed. If specified in the dialog, the downloaded files are deleted after parsing.
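The query, count, and download steps can be illustrated with a minimal sketch. It assumes the public NCBI E-utilities endpoints (`esearch.fcgi` and `efetch.fcgi`) as the way to reach PubMed; this page does not state how the node itself talks to the database. The function names `count_url`, `search_url`, and `fetch_url` are hypothetical helpers, and the sketch only builds request URLs without sending them.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities base URL; not taken from the node description.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def count_url(query, database="pubmed"):
    """URL that asks only for the number of results of a query."""
    params = {"db": database, "term": query, "rettype": "count"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_url(query, max_results, database="pubmed"):
    """URL that returns at most max_results document IDs for a query."""
    params = {"db": database, "term": query, "retmax": max_results}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def fetch_url(ids, database="pubmed"):
    """URL that downloads the documents with the given IDs as XML."""
    params = {"db": database, "id": ",".join(ids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(count_url("breast cancer"))
    print(search_url("breast cancer", 10))
    print(fetch_url(["1", "2"]))
```

In this sketch, `count_url` loosely corresponds to the "Number of results" button and the `max_results` argument to the "Maximal results" option.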


Query
The query that is sent to the specified database.
Number of results
After clicking the button, the number of results for the specified query is shown.
Maximal results
The maximum number of resulting documents to download and parse.
Append query column
If checked, a string column containing the specified query string is appended.
Database
The database to send the query to and receive the resulting documents from, e.g. PubMed.
Extract meta information if provided by database
If checked, meta information is extracted if provided by the database. In the case of PubMed, the meta information consists of the PubMed ID, the chemical list, and the MeSH heading list assigned to the article. The meta information is stored as a regular section in the documents, annotated as a meta information section.
Documents directory
The directory to save the documents to. The specified directory must exist, be writable, and be empty.
Delete after parsing
If checked, the files containing the documents will be deleted after parsing.
Document category
The category of the documents.
Document type
The type of the documents.
Word tokenizer
Select the tokenizer used for word tokenization. Go to Preferences -> KNIME -> Textprocessing to read the description for each tokenizer.
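The requirements on the "Documents directory" option (the directory must exist, be writable, and be empty) can be checked before running the node. The following is a minimal sketch of such a check; `check_documents_dir` is a hypothetical helper and not part of KNIME.

```python
import os


def check_documents_dir(path):
    """Raise ValueError unless path is an existing, writable, empty directory."""
    if not os.path.isdir(path):
        raise ValueError(f"Not an existing directory: {path}")
    if not os.access(path, os.W_OK):
        raise ValueError(f"Directory is not writable: {path}")
    if os.listdir(path):
        raise ValueError(f"Directory is not empty: {path}")


if __name__ == "__main__":
    import tempfile

    # A freshly created temporary directory satisfies all three conditions.
    with tempfile.TemporaryDirectory() as tmp:
        check_documents_dir(tmp)
        print("Directory is usable:", tmp)
```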

Input Ports

This node has no input ports

Output Ports

An output table which contains the parsed document data.
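The meta information mentioned for PubMed (PubMed ID, chemical list, and MeSH heading list) can be illustrated with a small sketch that reads these fields from PubMed-style XML. The element names follow the PubMed XML format as far as I know it; the sample record is made up for illustration, and `extract_meta` is a hypothetical helper that does not reflect how the node stores the data internally.

```python
import xml.etree.ElementTree as ET

# Made-up sample record using PubMed-style element names (illustration only).
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000000</PMID>
      <ChemicalList>
        <Chemical><NameOfSubstance>Example Substance</NameOfSubstance></Chemical>
      </ChemicalList>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Example Descriptor</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_meta(xml_text):
    """Return one dict per article with its ID, chemicals, and MeSH headings."""
    root = ET.fromstring(xml_text)
    records = []
    for citation in root.iter("MedlineCitation"):
        records.append({
            "pmid": citation.findtext("PMID"),
            "chemicals": [
                e.text for e in citation.findall("ChemicalList/Chemical/NameOfSubstance")
            ],
            "mesh_headings": [
                e.text for e in citation.findall("MeshHeadingList/MeshHeading/DescriptorName")
            ],
        })
    return records


if __name__ == "__main__":
    for record in extract_meta(SAMPLE_XML):
        print(record)
```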


This node has no views



