Load SD-Files (SDF)

Each sd-file is loaded and parsed so that each output row of the table contains the data for one compound in the sd-file. The node attempts to parse the header block, subject to two caveats. Firstly, not all SDF writers produce correctly formatted MOL block headers, particularly in the second line. Incorrectly formatted lines are parsed as if they were correctly formatted, which may result in spurious contents in some columns. The parsing is permissive, so the node will not fail to execute on poorly formatted input. Secondly, V3000 mol and sd-files use '0' counts in the header counts line. It is these counts which are reported, not those in the new 'M V30 COUNTS ...' line.
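The permissive, fixed-width style of header parsing described above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function names and the returned keys are hypothetical, and the column positions are taken from the standard CTfile MOL header layout (initials, program name, timestamp and dimension code on line 2; atom count, bond count and version on the counts line).

```python
from typing import Optional


def _to_int(field: str) -> Optional[int]:
    """Return the integer held in a fixed-width field, or None if it is not a number."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_mol_header(mol_block: str) -> dict:
    """Permissively slice the first four lines of a MOL block by column position.

    Hypothetical helper for illustration only. Nothing is validated: a badly
    formatted line is sliced as if it were correct, so the resulting values
    may be spurious, but no exception is raised.
    """
    lines = mol_block.splitlines()
    lines += [""] * max(0, 4 - len(lines))  # pad so short blocks do not fail
    name, info, comment, counts = lines[:4]
    return {
        "name": name.strip(),
        "user_initials": info[0:2].strip(),
        "program": info[2:10].strip(),
        "timestamp": info[10:20].strip(),   # MMDDYYHHmm
        "dimensions": info[20:22].strip(),  # e.g. 2D or 3D
        "comment": comment.strip(),
        # For V3000 blocks these header counts are 0; the real counts live
        # in the 'M  V30 COUNTS' line, which this sketch does not read.
        "atom_count": _to_int(counts[0:3]),
        "bond_count": _to_int(counts[3:6]),
        "version": counts[33:39].strip(),
    }
```

Because the slices never raise on short or misaligned lines, a malformed second line simply yields odd values in the corresponding fields, which mirrors the caveat about spurious column contents.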

This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com.

Options

Select files
Use the 'Browse...' and 'Add from history' buttons to add all the files to be included in the table. Alternatively, a flow variable can be specified, containing one or more filenames separated by ';'. The most recently added file(s) will be selected. If no files are highlighted in the 'Selected files' box, the 'Browse...' button opens a new file browser window in the default location; otherwise, the file browser opens in the location of the last highlighted file.
Select file encoding
Select the file encoding. 'Guess' will attempt to determine it from the connection property of the URL, the content-type, and the Byte-Order Mark (BOM). UTF-8 will be used if no other encoding is identified.
Include paths in output table
Include the full file paths and URLs as columns in the output table.
Include filename in Row IDs
The filename will be included in the Row ID (duplicates will be suffixed with '_n', where n is an index starting at 0). Otherwise, the Row IDs will be in the format 'Row_n', where n is an index starting at 0.
Include filenames in output table
Include the filename as a column in the output table.
Newline output
The newline character(s) to be used in the SDF Cell. 'System' will dynamically use the newline of the system the node is executed on (the current value is shown in the dialog, but on another system, that system's local value will be used). 'Preserve incoming' will look in the first 65535 characters of the file for the first linebreak ('\r\n' or '\n') and use that.
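The 'Preserve incoming' behaviour described for the Newline output option can be sketched in Python as follows. This is an illustrative sketch, not the node's code: the function names are hypothetical, and falling back to the system newline when no linebreak is found within the limit is an assumption, as the description does not say what happens in that case.

```python
import os


def detect_newline(text: str, limit: int = 65535, default: str = os.linesep) -> str:
    """Return the first linebreak ('\r\n' or '\n') found in the first `limit` characters.

    Hypothetical helper. `default` is returned when no linebreak is found;
    that fallback is an assumption, not documented node behaviour.
    """
    head = text[:limit]
    idx = head.find("\n")
    if idx == -1:
        return default
    # A '\n' directly preceded by '\r' is a Windows-style linebreak.
    return "\r\n" if idx > 0 and head[idx - 1] == "\r" else "\n"


def detect_newline_in_file(path: str, encoding: str = "utf-8", limit: int = 65535) -> str:
    """Read at most `limit` characters from the file and detect its newline."""
    # newline="" disables newline translation so that '\r\n' is seen as written.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return detect_newline(handle.read(limit), limit)
```

Only the first linebreak is inspected, so a file with mixed line endings is classified by whichever ending appears first.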

Input Ports

Optional flow variables containing file path(s)

Output Ports

Parsed content of the loaded files

Views

This node has no views
