SDF Reader

This node reads one or more SDF files and creates an output table with one row per molecule. You can select which parts of each molecule are extracted into columns of the output table. By default only the molecular structure is output, but in the Property Handling section you can select which properties from the SD file, if any, are extracted into additional columns.

Options

Mode
Determines whether a single file or multiple files are selected.
  • File: Select a single file.
  • Files in folders: Select a folder and apply filters to select files within it.
Source
The path to the file or folder to select.
Include Subfolders
Whether to include subfolders when selecting multiple files within a folder.
Filter by file extension
Enable filtering files by their extension (e.g. 'xlsx;xlsm').
File extensions
Semicolon-separated list of file extensions to include (e.g. 'xlsx;xlsm;xls'). Case-insensitive unless 'Case sensitive (extensions)' is enabled.
Case sensitive (extensions)
Treat the entered extensions as case sensitive when matching.
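
The extension filter described above can be sketched in Python as follows. This is a minimal illustration, not the node's implementation: the function names `parse_extensions` and `matches_extension` are hypothetical, and ignoring leading dots and surrounding spaces in the list is an assumption.

```python
def parse_extensions(spec):
    """Split a semicolon-separated extension list such as 'sdf;sd'."""
    # Assumption: empty entries, surrounding spaces and leading dots are ignored.
    return [part.strip().lstrip(".") for part in spec.split(";") if part.strip()]


def matches_extension(file_name, spec, case_sensitive=False):
    """Return True if file_name ends with one of the listed extensions."""
    extensions = parse_extensions(spec)
    if not case_sensitive:
        # Case-insensitive matching: compare lower-cased name and extensions.
        file_name = file_name.lower()
        extensions = [ext.lower() for ext in extensions]
    return any(file_name.endswith("." + ext) for ext in extensions)
```

With this sketch, `matches_extension("ligands.SDF", "sdf;sd")` accepts the file, while the same call with `case_sensitive=True` rejects it.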
Filter by file name
Enable filtering by file name pattern with wildcards or regular expression.
File name filter pattern
Pattern for file name filtering. With type 'Wildcard', use '*' and '?'. With type 'Regex', enter a Java regular expression.
File name filter type
Choose how to interpret the file name pattern.
  • Wildcard: Enable using '*' and '?' as wildcards.
  • Regular Expression: Enable using a Java regular expression.
Case sensitive (names)
Make file name filtering case sensitive.
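
The difference between the two filter types can be illustrated with a short Python sketch. It rests on several assumptions: the helper `name_matches` is hypothetical, the pattern is assumed to match the whole file name, Python's `re` syntax differs in details from Java regular expressions, and Python's `fnmatch` additionally interprets `[...]` character classes, which the node's wildcard mode may not.

```python
import fnmatch
import re


def name_matches(file_name, pattern, filter_type="wildcard", case_sensitive=False):
    """Match a file name against a wildcard or regular-expression pattern."""
    flags = 0 if case_sensitive else re.IGNORECASE
    if filter_type == "wildcard":
        # fnmatch.translate converts '*' and '?' into an equivalent regex.
        regex = fnmatch.translate(pattern)
    elif filter_type == "regex":
        regex = pattern
    else:
        raise ValueError("filter_type must be 'wildcard' or 'regex'")
    # Assumption: the pattern has to match the entire file name.
    return re.fullmatch(regex, file_name, flags) is not None
```

For example, the wildcard pattern `batch_??.sdf` and the regular expression `batch_\d+\.sdf` both accept `batch_01.sdf`, but only the regular expression also accepts `batch_1.sdf`.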
Include hidden files
Include hidden files in the selection.
Include special files
Include special file types (workflows, etc.).
Filter by folder name
Enable filtering of folders by name pattern before descending into them.
Folder name pattern
Pattern for folder name filtering. Note that the pattern is applied to the path relative to the specified root folder. Use '*' and '?' with filter type 'Wildcard'. With type 'Regex', enter a Java regular expression.
Folder name filter type
Choose how to interpret the folder name pattern.
  • Wildcard: Enable using '*' and '?' as wildcards.
  • Regular Expression: Enable using a Java regular expression.
Case sensitive (folders)
Make folder name filtering case sensitive.
Include hidden folders
Descend into folders that are hidden (if they otherwise pass filters).
Follow symlinks
Follow symbolic links while traversing folders (only relevant when selecting a folder).
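
How the folder options interact can be sketched with Python's `os.walk`. This is an illustration only: the function `list_files` is hypothetical, and treating a folder as hidden when its name starts with a dot is an assumption that holds on Unix-like systems but not on Windows.

```python
import os
from pathlib import Path


def list_files(root, include_subfolders=True, include_hidden_folders=False,
               follow_symlinks=False):
    """Collect files below root according to the folder-related options."""
    found = []
    for current, dirs, files in os.walk(root, followlinks=follow_symlinks):
        if not include_subfolders:
            # Clearing dirs in place stops os.walk from descending further.
            dirs[:] = []
        elif not include_hidden_folders:
            # Assumption: hidden folders are those whose name starts with '.'.
            dirs[:] = [d for d in dirs if not d.startswith(".")]
        found.extend(Path(current, name) for name in files)
    return found
```

The extension and file name filters would then be applied to the returned paths.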
Limit number of read molecules
If enabled, only the specified number of molecules will be read from the input files.
Maximum number of molecules
The maximum number of molecules to read from the input files.
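
Limiting the number of molecules amounts to stopping after a fixed number of records. The following Python sketch shows the idea; `iter_sdf_records` and `read_limited` are hypothetical names, and the sketch relies only on the SDF convention that each record ends with a line starting with `$$$$`.

```python
from itertools import islice


def iter_sdf_records(lines):
    """Yield each record as a string, ending with its '$$$$' delimiter line."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if line.startswith("$$$$"):
            yield "\n".join(record)
            record = []
    # Trailing lines without a closing '$$$$' are dropped in this sketch.


def read_limited(lines, max_molecules=None):
    """Read at most max_molecules records; None means no limit."""
    return list(islice(iter_sdf_records(lines), max_molecules))
```

Because the records are produced lazily, reading stops as soon as the limit is reached and the rest of the input is never parsed.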
Use molecule name as row ID
Instead of generating row IDs, the molecules' names are used as row IDs. If the names are not unique, the node fails.
Extract molecule name
If selected, the molecules' names are put into a column called 'Molecule name' in the output table. This option can be used together with the previous one.
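
The uniqueness requirement for name-based row IDs can be sketched as follows. The function `build_row_ids` is hypothetical, and the generated `Row0`, `Row1`, ... pattern is assumed here for illustration.

```python
def build_row_ids(names, use_name_as_row_id):
    """Return row IDs, failing on duplicate names when names are used as IDs."""
    if not use_name_as_row_id:
        # Assumption: generated IDs follow the pattern Row0, Row1, ...
        return ["Row%d" % i for i in range(len(names))]
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError("Duplicate molecule name: %r" % name)
        seen.add(name)
    return list(names)
```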
Add column with source location
Enabling this option will add a column showing the source location for each molecule.
Extract SDF blocks
Extract into a column the complete molecule record, starting from the title up to the '$$$$' record delimiter.
Extract Mol blocks
Extract into a column the molecule's Mol block, i.e. the part starting from the title up to the line before the first property (indicated by a line starting with '>').
Extract Ctab blocks
Extract into a column the molecule's Ctab block, i.e. the part starting from the line after the header up to the line before the first property (indicated by a line starting with '>').
Extract counts
Extract into two columns the molecules' atom and bond counts as they are stored inside the Ctab block.
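
The relationship between the SDF block, the Mol block, the Ctab block and the counts can be sketched in Python. This is a simplified illustration under stated assumptions: the function `split_record` and the sample `RECORD` are hypothetical, the record uses the V2000 format (three header lines, then a counts line with the atom count in columns 1-3 and the bond count in columns 4-6), and including the `$$$$` line in the SDF block is an assumption.

```python
RECORD = "\n".join([
    "water",                                   # header line 1: title
    "  sketch",                                # header line 2: program/timestamp
    "",                                        # header line 3: comment
    "  3  2  0  0  0  0  0  0  0  0999 V2000",  # counts line: 3 atoms, 2 bonds
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
    ">  <ID>",
    "42",
    "",
    "$$$$",
])


def split_record(record):
    """Split one SDF record into the parts the node can extract."""
    lines = record.splitlines()
    # The Mol block ends before the first property line (starting with '>')
    # or, if the record has no properties, before the '$$$$' delimiter.
    end = next(
        (i for i, line in enumerate(lines)
         if line.startswith(">") or line.startswith("$$$$")),
        len(lines),
    )
    mol_lines = lines[:end]
    ctab_lines = mol_lines[3:]  # skip the three header lines
    counts = ctab_lines[0]      # first Ctab line holds the counts (V2000)
    return {
        "name": lines[0].strip(),
        "sdf_block": record,
        "mol_block": "\n".join(mol_lines),
        "ctab_block": "\n".join(ctab_lines),
        "atom_count": int(counts[0:3]),
        "bond_count": int(counts[3:6]),
    }
```

Calling `split_record(RECORD)` yields the name `water`, an atom count of 3 and a bond count of 2; the Mol block ends with the `M  END` line.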
Extract all properties
During execution scan all source locations for all existing properties and add them to the output table. Please note that this requires two scans over all sources during execution.
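
Why two scans are needed can be seen from a small Python sketch: the full set of columns is only known after every record has been inspected, and only then can the rows be built. The names `parse_properties` and `extract_all_properties` are hypothetical, and the parsing is simplified (a property header is a line starting with `>` that contains `<NAME>`, and a blank line ends the value).

```python
import re

# Property header such as '>  <ID>' or '> 25 <MELTING.POINT>'.
HEADER = re.compile(r"^>.*?<([^>]+)>")


def parse_properties(record):
    """Return the properties of one record as a name -> value dictionary."""
    props = {}
    current = None
    for line in record.splitlines():
        if line.startswith("$$$$"):
            break
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            props[current] = []
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line ends the property value
            else:
                props[current].append(line)
    return {name: "\n".join(value) for name, value in props.items()}


def extract_all_properties(records):
    """Two scans: collect all property names, then build the rows."""
    # First scan: every property name, in order of first appearance.
    columns = []
    for record in records:
        for name in parse_properties(record):
            if name not in columns:
                columns.append(name)
    # Second scan: one row per record; missing properties become None.
    rows = []
    for record in records:
        props = parse_properties(record)
        rows.append([props.get(name) for name in columns])
    return columns, rows
```

A record that lacks a property found elsewhere gets an empty cell (`None` in this sketch) in the corresponding column.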
Scan for properties
Press this button to scan the selected source for properties. The properties found are listed under Properties.
Cancel
Press this button to cancel the scan that is currently being performed.
Properties
Properties found during scanning are displayed here. Select which properties to extract and optionally change their types. You can only change to a more general type (Integer → Double → String).
  • Extract: Whether to extract this property.
  • Property Name: The name of the property as found in the SDF file.
  • Type: The data type of the property. You can change this to a more general type (Integer → Double → String), but changing to a more specific type may cause errors.
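
The generalization order Integer → Double → String can be sketched in Python as follows. This is an illustration of the idea, not the node's type detection: the helper names are hypothetical, and Python's `int` and `float` accept some spellings (for example `1_000` or `nan`) that the node may treat differently.

```python
# Types ordered from most specific to most general.
ORDER = ["Integer", "Double", "String"]


def most_specific_type(value):
    """Return the most specific type that can represent a single value."""
    try:
        int(value)
        return "Integer"
    except ValueError:
        pass
    try:
        float(value)
        return "Double"
    except ValueError:
        return "String"


def column_type(values):
    """Return the most general type required by any value in the column."""
    return max((most_specific_type(v) for v in values),
               key=ORDER.index, default="String")


def can_change_type(current, target):
    """Only changes to the same or a more general type are allowed."""
    return ORDER.index(target) >= ORDER.index(current)
```

A column containing `1` and `2.5` is detected as Double; it can still be read as String, but not as Integer.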
File encoding
Defines the character set used to read the file. You can choose from a list of character encodings (UTF-8, UTF-16, etc.), or specify any other encoding supported by your Java Virtual Machine (VM). The default value uses the default encoding of the Java VM, which may depend on the locale or the Java property "file.encoding".
  • OS default: Uses the default encoding set by the operating system.
  • ISO-8859-1: ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1.
  • US-ASCII: Seven-bit ASCII, also referred to as US-ASCII.
  • UTF-8: Eight-bit UCS Transformation Format.
  • UTF-16: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark in the file.
  • UTF-16BE: Sixteen-bit UCS Transformation Format, big-endian byte order.
  • UTF-16LE: Sixteen-bit UCS Transformation Format, little-endian byte order.
  • Other: Enter a valid charset name supported by the Java Virtual Machine.
Custom encoding
A custom character set used to read the SDF file.
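
The effect of the encoding choice can be demonstrated with a short Python sketch. It uses Python's codecs rather than the Java VM's, so the set of available charset names is not identical; the function `decode_sdf_bytes` is hypothetical.

```python
import locale


def decode_sdf_bytes(raw, encoding=None):
    """Decode file content; None falls back to the platform default encoding."""
    if encoding is None:
        # Rough analogue of the 'OS default' option.
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# A property value containing a non-ASCII character, stored as ISO-8859-1.
raw = "Café".encode("ISO-8859-1")
print(decode_sdf_bytes(raw, "ISO-8859-1"))  # decodes correctly

try:
    decode_sdf_bytes(raw, "UTF-8")
except UnicodeDecodeError as error:
    # The same bytes are not valid UTF-8.
    print("wrong encoding:", error.reason)
```

Choosing the wrong encoding does not always raise an error: UTF-8 bytes read as ISO-8859-1 decode without complaint but produce garbled characters, so non-ASCII property values are worth checking after reading.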

Input Ports

The file system connection.

Output Ports

Successfully parsed molecules
Molecules that failed to parse

Views

This node has no views
