
Python Script 2:2

SchrodingerFreeNodes version 4.1.11.201802140058 by Schrodinger

Executes a Python script that has access to all Schrodinger Python libraries, taking 2 input tables and returning 2 output tables.

The node defaults to a script that simply iterates through the input rows and outputs the same data:

iterator = inData[0].iterator()
while iterator.hasNext():
	row = iterator.next()
	outContainer[0].addRowToTable(row)

Several Python classes are provided to give easy access to the input table(s) and to make it easy to create the output table(s).

BufferedInputTable - The array variable inData holds objects of this type and is used to reference the input table(s). In the simple case above, inData[0] references the first input table.
iterator() - Returns a DataRowIterator that iterates through each row of the input table.
getDataTableSpec() - Returns the DataTableSpec instance that defines the input table specification.

DataRowIterator - This class is used to iterate through the rows of the input table. Each row is accessed using the DataRow class.
hasNext() - Returns whether next() has another row to return.
next() - Returns the next DataRow in the table.
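
The hasNext()/next() loop can be wrapped in a Python generator so that a script can use a plain for loop. The following is a minimal sketch and not part of the node's API: iter_rows is a hypothetical helper, and StubIterator and StubTable are stand-ins that exist only so the snippet runs outside KNIME. Inside the node, inData[0] would be passed instead.

```python
def iter_rows(table):
    """Yield each row of a table that offers iterator() with hasNext()/next()."""
    iterator = table.iterator()
    while iterator.hasNext():
        yield iterator.next()


# Stand-ins (assumptions) that mimic the documented iterator protocol.
class StubIterator(object):
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._rows)

    def next(self):
        row = self._rows[self._pos]
        self._pos += 1
        return row


class StubTable(object):
    def __init__(self, rows):
        self._rows = rows

    def iterator(self):
        return StubIterator(self._rows)


table = StubTable(["row-a", "row-b", "row-c"])
collected = [row for row in iter_rows(table)]
print(collected)
```

Because iter_rows relies only on iterator(), hasNext() and next(), it should work unchanged with any object that follows the protocol described above.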

DataRow - This class is used to access all of the cells in a particular row of an input table.
getBufferedDataTable() - Returns the BufferedDataTable associated with this particular row. The table specification can be accessed from this instance, which has the columns and types of this row.
getCell(index) - Returns the cell at the given index as the DataCell subclass that matches the column type (see below).
getKey() - Returns the key value of the row.
setKey(keyValue) - Sets the key value of the row.

BufferedDataContainer - The array variable outContainer holds objects of this type and is used to reference and populate the output table(s). In the simple case above, outContainer[0] references the first output table.
addRowToTable(dataRow) - Appends a row to the container. The specification of the output table is set from the types of the cells in the added rows. If rows with different types in the same column are added to the same BufferedDataContainer, an error occurs. If there are missing cells (e.g., the currently added row has more cells than any previous row), the table is filled with missing cells of the MissingCell type.

DataTableSpec - The table specification that defines the table's columns and their types. Both BufferedInputTable and BufferedDataContainer provide a DataTableSpec, which offers several ways of looking up columns and types:
allColumns - An ordered list of the DataColumnSpec columns.
columnByNumber - A dictionary of the DataColumnSpec columns that uses the index as the lookup key.
columnIndexByName - A dictionary that uses the column name as the key to look up the column index.
findColumnIndex(columnName) - Returns the column index for the given column name.

DataColumnSpec - The column specification which keeps the column name and type.
getName() - Returns the column name.
getType() - Returns the Java class of the column's data value. There is also a global dictionary, columnTypeToCellType, that maps this Java class to the corresponding DataCell type.
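
As an illustration of the lookups above, the following sketch resolves column names to indices through columnIndexByName and reports unknown names using allColumns and getName(). It is a sketch under assumptions: resolve_columns is a hypothetical helper, StubColumnSpec and StubTableSpec are stand-ins so the snippet runs outside KNIME, the column names are made up, and indices are assumed to start at 0. Inside the node, the spec would come from inData[0].getDataTableSpec().

```python
def resolve_columns(spec, wanted):
    """Return the index of each wanted column name, or raise KeyError for unknown names."""
    missing = [name for name in wanted if name not in spec.columnIndexByName]
    if missing:
        available = [column.getName() for column in spec.allColumns]
        raise KeyError("Columns not found: %s (available: %s)" % (missing, available))
    return [spec.columnIndexByName[name] for name in wanted]


# Stand-ins (assumptions) exposing only the documented attributes used above.
class StubColumnSpec(object):
    def __init__(self, name):
        self._name = name

    def getName(self):
        return self._name


class StubTableSpec(object):
    def __init__(self, names):
        self.allColumns = [StubColumnSpec(name) for name in names]
        self.columnIndexByName = dict((name, i) for i, name in enumerate(names))


spec = StubTableSpec(["Name", "MW", "LogP"])  # hypothetical column names
indices = resolve_columns(spec, ["MW", "LogP"])
print(indices)
```

Resolving indices once before the row loop keeps the per-row code down to getCell(index) calls and fails early with a readable message when a column has been renamed upstream.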

DataCell types

DataCell - This is the base class that represents a cell in a table. In most cases, a cell is represented by one of the subclasses listed below, and only special instances have the DataCell type (such as the global instance variable MissingCell, which converts to a missing cell in KNIME).
setValue(val) - Sets the value of the cell. Most cells store the value in the member variable self.value. Some cells instead store an external file name in self.cellFileName; in that case the member variable self.hasFile is set.
getValue() - Returns the contents of the cell. If the value is stored in self.value, it returns self.value. If the contents are stored in the file that self.cellFileName points to, it reads the file and returns its contents.

PdbCell - Cell type that stores PDB molecules.
setToStructure(structure) - Sets the value of this cell to a schrodinger.structure.Structure class.
getStructureReader() - Returns a schrodinger.structure.StructureReader instance of this cell.

SdfCell - Cell type that stores Sdf molecules.
setToStructure(structure) - Sets the value of this cell to a schrodinger.structure.Structure class.
getStructureReader() - Returns a schrodinger.structure.StructureReader instance of this cell.

Mol2Cell - Cell type that stores Mol2 molecules.
setToStructure(structure) - Sets the value of this cell to a schrodinger.structure.Structure class.
getStructureReader() - Returns a schrodinger.structure.StructureReader instance of this cell.

SmilesCell - Cell type that stores a SMILES molecule.

StringCell - Cell type that stores a primitive string value.

IntCell - Cell type that stores a primitive integer value.

DoubleCell - Cell type that stores a primitive double value.

MaestroCell - Cell type that stores Maestro molecules.
setToStructure(structure) - Sets the value of this cell to a schrodinger.structure.Structure class.
setToFile(filename) - Sets the cell value to the contents of the file.
getStructureReader() - Returns a schrodinger.structure.StructureReader instance of this cell.

SequenceCell - Cell type that stores Sequence(s).

AlignmentCell - Cell type that stores Alignment(s).

TextFileCell - Cell type that stores text files.
setToFile(filename) - Sets the cell value to the contents of the file.

SurfaceCell - Cell type that stores Surface data.
setSurface(surface) - Sets the cell value to surface.
setSurfaceFromFile(filename) - Sets the cell value to the contents of the file.
writeToFile(filename) - Writes the cell value to the given filename.

DataRow types

DataRow - This is the base class that represents a row in a table. A row is represented by one of the subclasses listed below.
setKey(rowKey) - Sets the row key
getKey() - Returns the row key
getCell(colindex) - Returns the DataCell at the specified column index
getCellByColumnName(colname) - Returns the DataCell at the specified column name
setCell(index, cell) - Sets the DataCell for the given index

DefaultRow - This class is used to generate rows used for output.
DefaultRow(rowKey, cellList[]) - The constructor takes a row key and cell list.

AppendedColumnRow - This class is used to append cells to an existing row. The row key from the row instance given is used as the row key and the appended cells are added after the cells in the row instance.
AppendedColumnRow(row, cellList[]) - The constructor takes a row and a cell list.

The Python dictionary flowVariables contains all the flow variables passed to this node as key:value pairs, where the key is the flow variable name and the value is the flow variable value.
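
Since flowVariables is an ordinary dictionary, a script can read a variable with a fallback for the case where it was not passed to the node. The sketch below is illustrative only: get_flow_variable is a hypothetical helper, the variable names and values are made up, and the dictionary defined here is a stand-in for the one the node provides. Whether values arrive as strings or as numbers is not stated above, so the helper converts defensively.

```python
def get_flow_variable(flow_vars, name, default, cast=str):
    """Return flow_vars[name] converted with cast, or default if absent or not convertible."""
    if name not in flow_vars:
        return default
    try:
        return cast(flow_vars[name])
    except (TypeError, ValueError):
        return default


# Stand-in for the dictionary supplied by the node (hypothetical names and values).
flowVariables = {"cutoff": "2.5", "label": "run-01"}

cutoff = get_flow_variable(flowVariables, "cutoff", 1.0, float)
max_rows = get_flow_variable(flowVariables, "max_rows", 100, int)
print("cutoff=%s max_rows=%s" % (cutoff, max_rows))
```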

Code Examples

The following simple examples show how to use this Python node:

Adding a 3rd column that is the sum of the first 2 columns:

iterator = inData[0].iterator()
while iterator.hasNext():
    row = iterator.next()
    col0 = row.getCell(0)
    col1 = row.getCell(1)
    newDoubleCell = DoubleCell()
    newDoubleCell.setValue(col0.value + col1.value)
    newRow = AppendedColumnRow(row, [ newDoubleCell ])  # Appending new column to the input columns
    newRow.colNames = [ "Sum of Column 1 and 2"]   # sets the column name
    outContainer[0].addRowToTable(newRow)

Converting SD to Maestro type:

iterator = inData[0].iterator()
while iterator.hasNext():
    row = iterator.next()
    sdc = row.getCell(0)
    for st in sdc.getStructureReader():
        newCell = MaestroCell()
        newCell.setToStructure(st)
        newRow = DefaultRow(row.getKey(), [ newCell ])  # DefaultRow does not include the input columns
        outContainer[0].addRowToTable(newRow)

Ungrouping input Maestro molecules (similar to the Ungroup MAE node):

iterator = inData[0].iterator()
r = 0
while iterator.hasNext():
    row = iterator.next()
    mc = row.getCell(0)
    for st in mc.getStructureReader():
        newMC = MaestroCell()
        newMC.setToStructure(st)
        newRow = DefaultRow("Row%s" % r, [ newMC ])
        outContainer[0].addRowToTable(newRow)
        r = r + 1
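
The examples above write only to outContainer[0]. As a minimal sketch of using both output ports, the following routes each row of the first input table to one of the two outputs depending on a numeric value in its first column. Everything above the marked script section is a stand-in so the snippet runs outside KNIME, and CUTOFF is a hypothetical threshold. The second input table is ignored here, and how the node treats an output container that receives no rows is not covered by this description.

```python
# --- Stand-ins (assumptions); inside the node, inData and outContainer are provided ---
class StubCell(object):
    def __init__(self, value):
        self.value = value

    def getValue(self):
        return self.value


class StubRow(object):
    def __init__(self, key, values):
        self._key = key
        self._cells = [StubCell(v) for v in values]

    def getKey(self):
        return self._key

    def getCell(self, index):
        return self._cells[index]


class StubIterator(object):
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._rows)

    def next(self):
        row = self._rows[self._pos]
        self._pos += 1
        return row


class StubInputTable(object):
    def __init__(self, rows):
        self._rows = rows

    def iterator(self):
        return StubIterator(self._rows)


class StubContainer(object):
    def __init__(self):
        self.rows = []

    def addRowToTable(self, row):
        self.rows.append(row)


rows = [StubRow("Row0", [1.5]), StubRow("Row1", [4.0]), StubRow("Row2", [2.5])]
inData = [StubInputTable(rows), StubInputTable([])]
outContainer = [StubContainer(), StubContainer()]

# --- Script section: the part that would go into the node ---
CUTOFF = 2.5  # hypothetical threshold
iterator = inData[0].iterator()
while iterator.hasNext():
    row = iterator.next()
    value = row.getCell(0).getValue()
    if value >= CUTOFF:
        outContainer[0].addRowToTable(row)  # rows at or above the cutoff
    else:
        outContainer[1].addRowToTable(row)  # all remaining rows
```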

Tips:
Environment variables pointing to paths with spaces should be quoted when accessed in the Script section, e.g., "$SCHRODINGER" on Windows, as it is set to "C:\Program Files" by default.

Input Ports

This table is accessible in the python script using the variable inData[0] that has the type BufferedDataTable.
This table is accessible in the python script using the variable inData[1] that has the type BufferedDataTable.

Output Ports

After the Python script is executed, this output table is generated from the variable outContainer[0], which has the type BufferedDataContainer.
After the Python script is executed, this output table is generated from the variable outContainer[1], which has the type BufferedDataContainer.

Views

Std output/error of Python Script 2:2
Std output/error of Python Script 2:2

Update Site

To use this node in KNIME, install SchrodingerFreeNodes from the following update site:
