PDB Connector (XML Query String)

Deprecated
Vernalis PDB Connector KNIME nodes package version 1.27.2.v202010191232 by Vernalis (R&D), UK

The PDB Connector (XML Query String) node provides connections to two RESTful Web Services for search and retrieval of information from the Protein Data Bank:

  • Advanced Search (http://www.rcsb.org/pdb/rest/search)
  • Custom Report (http://www.rcsb.org/pdb/rest/customReport)

The dialog options are designed to closely mimic the interactive reporting options provided at http://www.pdb.org/pdb/search/advSearch.do, and the user is encouraged to explore this resource for a full explanation of each search option. In this node the query is entered as a single XML string (for a user-friendly query builder, see the original PDB Connector node). The query can be copied and pasted from the "Query Details" link on a PDB query results page at the RCSB PDB. After node execution, the XML query is available as a flow variable (xmlQuery). Worked examples of using the node with these methods are provided here.
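As a rough illustration of what "a single XML string" sent to the Advanced Search service might look like, the sketch below builds a keyword query and wraps it in a POST request without sending it. The element names (orgPdbQuery, queryType, description, keywords), the query type value and the Content-Type header are assumptions based on the legacy RCSB query layout; they are not taken from this node's documentation. The node is marked deprecated, so the endpoint may no longer respond.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint quoted in the node description; it may no longer respond.
SEARCH_URL = "http://www.rcsb.org/pdb/rest/search"


def build_keyword_query(keywords: str) -> str:
    """Return a one-line XML query string (assumed legacy RCSB layout)."""
    root = ET.Element("orgPdbQuery")
    # Assumed query type name for a plain keyword search.
    ET.SubElement(root, "queryType").text = "org.pdb.query.simple.AdvancedKeywordQuery"
    ET.SubElement(root, "description").text = f"Text search for: {keywords}"
    ET.SubElement(root, "keywords").text = keywords
    return ET.tostring(root, encoding="unicode")


def build_search_request(xml_query: str) -> urllib.request.Request:
    """Wrap the XML string in a POST request object; nothing is sent here."""
    return urllib.request.Request(
        SEARCH_URL,
        data=xml_query.encode("utf-8"),
        # Assumed content type; adjust if the service expects another.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


xml_query = build_keyword_query("kinase")
request = build_search_request(xml_query)
print(xml_query)
```

The printed string is the kind of text that would be pasted into the XML Query field or supplied through the xmlQuery flow variable.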

The node also makes it simple to generate a second report table, in a different report format, from a query already run in either this node or the PDB Connector node, because both nodes provide the query XML as a flow variable at their output ports.

The node can use either the POST or the GET variant of the report web service. The POST option is newer and should be used unless machine memory is an issue, because the node downloads the entire report into memory. The GET service requires multiple requests to the web service, and the URL length limits both the number of hits that can be processed in each call and the number of available report fields. Lower values for the maximum URL length (allowed range 2000-8000) result in more calls and fewer available fields, but are more reliable when running through a proxy server; higher values should be used where possible. Multiple calls to the GET service are likely to be intercepted by the PDB server "Robot Blocker", adding further time to the query, so the GET service should be avoided wherever possible.

During the second part of the execution, the node retries the download of each block of report data at increasing delay intervals (0, 1, 5, 10, 30, 60, 300, 600 seconds). There is an additional delay on each attempt, defaulting to 1 second, which can be adjusted by adding a line such as -Dknime.url.timeout=5000 (the value is in milliseconds, here 5 seconds) to the knime.ini file.
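The retry behaviour described above can be sketched as a loop over the quoted delay schedule. This is a minimal illustration, not the node's actual implementation: the function name fetch_with_retries, the choice of OSError as the retryable error, and waiting before each attempt (rather than after a failure) are all assumptions. The extra per-attempt delay controlled by knime.url.timeout is not modelled.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")

# Delay schedule quoted in the node description, in seconds.
RETRY_DELAYS = (0, 1, 5, 10, 30, 60, 300, 600)


def fetch_with_retries(
    fetch: Callable[[], T],
    delays: Sequence[float] = RETRY_DELAYS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try fetch() once per delay, waiting that long first; re-raise the last error."""
    last_error: Optional[Exception] = None
    for delay in delays:
        sleep(delay)
        try:
            return fetch()
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            last_error = error
    if last_error is None:
        raise ValueError("delays must not be empty")
    raise last_error


# Demonstration with a simulated download that fails twice, then succeeds.
attempts = []
waits = []


def flaky_block_download() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError("simulated network failure")
    return "structureId,resolution"


# waits.append stands in for time.sleep so the demo finishes instantly.
result = fetch_with_retries(flaky_block_download, sleep=waits.append)
print(result, waits)
```

Passing the sleep function as a parameter keeps the schedule testable without actually waiting up to ten minutes between attempts.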

The PDB Connector (XML Query String) node was developed by Vernalis (Cambridge, UK), based on the original PDB Connector node developed by Enspiral Discovery in collaboration with Vernalis (Cambridge, UK). For feedback and more information, please contact knime@vernalis.com


Query Options

Ligand Image Size
Select the ligand image size to use in the generation of ligand image URLs (applies to Ligand Image field only).
Use POST Query method (Faster)
Select the POST or GET service. GET is older and slower, and may limit the number of report fields which can be returned. If using the GET option, the maximum URL length can be set.
Max. Report GET URL Length
If using the GET option, the maximum URL length can be set between 2000 and 8000. Higher values allow more fields to be returned, and result in fewer calls to the webservice, but may fall foul of proxy servers.
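To show why a lower URL limit means more calls, the sketch below splits a list of PDB IDs into blocks so that each GET URL stays within the limit. It is an illustration under stated assumptions: the query parameter names (pdbids, customReportColumns), the helper names and the placeholder IDs are hypothetical, and the node's own blocking logic may differ.

```python
from typing import Iterator, List, Sequence
from urllib.parse import urlencode

# Endpoint quoted in the node description; it may no longer respond.
REPORT_URL = "http://www.rcsb.org/pdb/rest/customReport"


def build_report_url(pdb_ids: Sequence[str], fields: Sequence[str]) -> str:
    """Build one GET URL; the parameter names are assumptions."""
    query = urlencode(
        {"pdbids": ",".join(pdb_ids), "customReportColumns": ",".join(fields)},
        safe=",",  # keep commas readable instead of percent-encoding them
    )
    return f"{REPORT_URL}?{query}"


def chunk_ids_by_url_length(
    pdb_ids: Sequence[str], fields: Sequence[str], max_url_length: int = 2000
) -> Iterator[List[str]]:
    """Yield blocks of IDs whose report URL fits within max_url_length."""
    block: List[str] = []
    for pdb_id in pdb_ids:
        if len(build_report_url([pdb_id], fields)) > max_url_length:
            # Even a single ID does not fit: too many fields for this limit.
            raise ValueError("field list is too long for the URL length limit")
        candidate = block + [pdb_id]
        if block and len(build_report_url(candidate, fields)) > max_url_length:
            yield block
            block = [pdb_id]
        else:
            block = candidate
    if block:
        yield block


# Placeholder IDs for illustration only; a tiny limit makes the effect visible.
ids = [f"1A{n:02d}" for n in range(100)]
fields = ["structureId", "resolution"]
blocks = list(chunk_ids_by_url_length(ids, fields, max_url_length=200))
print(len(blocks), "calls needed")
```

Every extra field lengthens the fixed part of each URL, which leaves room for fewer IDs per call; this is the trade-off the setting controls.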
Clear Query
Clear all query options. If a flow variable has been set for the XML Query, this will also be unset.
Test Query
Test the current query and display the result count. If a flow variable has been specified for the XML query, its value will be used in the test. (NB This behaviour is different from that in the PDB Connector node, where the 'Test' button uses the query configured in the dialog, without reference to flow variable settings.)
Messages relating to the Test Query button are displayed in this area.
XML Query
The PDB XML advanced query text should be entered here. See above for details.

Report Options

Select report
Use the dropdown menu to select from a number of predefined standard reports. Select 'Customizable Table' to allow fine-grained selection of all custom report fields, using the individual and group field selectors.
Select All
Initialise a Customizable Table with all report fields.
Clear All
Initialise a Customizable Table with no report fields.

Input Ports

Optional flow variable connection, which may supply the XML query as a flow variable.

Output Ports

One-column table of PDB IDs that match the query.
Custom report fields.



To use this node in KNIME, install Vernalis KNIME Nodes from its update site.

