Cell Splitter

This node splits the content of a selected column into parts at a user-specified delimiter character. Depending on the output setting, it appends either a fixed number of columns to the input table, each carrying one part of the original value, or a single column of collection cells holding the split parts: list cells, or set cells in which duplicates are removed.
If a value contains more delimiters than needed (yielding more parts than there are appended columns), the surplus delimiters are ignored, so the last column contains the unsplit rest of the value.
If a value contains too few delimiters (yielding fewer parts than expected), the remaining new columns are left empty in that row.
Collection cells can differ in size from row to row, depending on how many parts each value yields. If trimming is enabled, leading and trailing spaces are removed from the content of the new columns.
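
The behaviour described above can be illustrated with a minimal Python sketch. It is not the node's implementation: the helper names `split_fixed` and `split_collection` are hypothetical, `None` stands in for an empty cell, and `str.strip()` removes all surrounding whitespace, which may be broader than what the node trims.

```python
def split_fixed(value, delimiter, n_columns, trim=True):
    """Split value into exactly n_columns parts (hypothetical helper)."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # Allow at most n_columns - 1 splits, so surplus delimiters
    # remain inside the last part (the "unsplit rest").
    parts = value.split(delimiter, n_columns - 1)
    if trim:
        parts = [p.strip() for p in parts]
    # Too few delimiters: pad with None, standing in for an empty cell.
    return parts + [None] * (n_columns - len(parts))


def split_collection(value, delimiter, as_set=False, trim=True):
    """Split value into a variable-length collection (hypothetical helper)."""
    parts = value.split(delimiter)
    if trim:
        parts = [p.strip() for p in parts]
    if as_set:
        # Remove duplicates; first-seen order is kept here only so the
        # result is easy to read, a set cell need not preserve order.
        parts = list(dict.fromkeys(parts))
    return parts


print(split_fixed("a, b, c, d", ",", 3))        # ['a', 'b', 'c, d']
print(split_fixed("a,b", ",", 3))               # ['a', 'b', None]
print(split_collection("x;y;x", ";"))           # ['x', 'y', 'x']
print(split_collection("x;y;x", ";", as_set=True))  # ['x', 'y']
```

With three output columns, the value "a, b, c, d" keeps its last delimiter inside the third part, while "a,b" leaves the third column empty.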

Options

Column selection
Select the column whose values are split.
Remove input column
When checked, the selected input column will not be part of the output table.
Delimiter
Specify the delimiter that separates the parts within each value.
Use escape character
If enabled, the backslash ("\") can be used to escape characters, such as "\t" for tabs. You can use the full escape capabilities of Java.
Quotation character
Specify the quotation character if the parts in the value are quoted. (The character used to escape quotes is always the backslash.) If no quotation character is needed, leave this field empty.
Remove leading and trailing white space chars (trim)
If checked, leading and trailing whitespace of each part (token) will be removed.
Output - as list
If selected, the output will consist of one column containing list collection cells in which the split parts are stored. Duplicates can occur in list cells.
Output - as set (remove duplicates)
If selected, the output will consist of one column containing set collection cells in which the split parts are stored. Duplicates are removed and cannot occur in set cells.
Output - as new columns
If selected, the output will consist of one or more columns, each containing a split part.
Split input column name
When outputting as new columns, check this option if the input column name can be split in the same manner as the column's content to obtain the names of the output columns.
Set Array Size
Check this and specify the number of columns to append. All created columns will be of type String. (See above for what happens if the split produces a different number of parts.)
Guess Size and Column Types
If this is checked, the node performs an additional scan through the entire data table and computes the number of columns needed to hold all parts of the split. In addition it determines the column type of the new columns.
Scan Limit
Maximum number of rows to scan for guessing the number of output columns.
Missing Value Handling
If selected, the node creates empty string cells instead of missing value cells.
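
To make the quotation and size-guessing options more concrete, here is a rough Python sketch under stated assumptions. The functions `split_quoted` and `guess_size` are hypothetical and not part of the node; the sketch assumes a single-character delimiter, drops the quotation characters from the result, and does not cover column type guessing.

```python
def split_quoted(value, delimiter=",", quote='"'):
    """Split on a single-character delimiter, ignoring delimiters
    that appear inside quoted parts (hypothetical helper)."""
    parts, current, in_quotes, i = [], [], False, 0
    while i < len(value):
        ch = value[i]
        # A backslash directly before the quote character escapes it.
        if quote and ch == "\\" and i + 1 < len(value) and value[i + 1] == quote:
            current.append(quote)
            i += 2
            continue
        if quote and ch == quote:
            in_quotes = not in_quotes  # the quote itself is not kept
        elif ch == delimiter and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def guess_size(values, delimiter=",", quote='"', scan_limit=None):
    """Return the number of columns needed for the scanned rows.
    values is a list of strings; scan_limit caps the rows inspected."""
    rows = values if scan_limit is None else values[:scan_limit]
    return max((len(split_quoted(v, delimiter, quote)) for v in rows), default=0)


print(split_quoted('a,"b,c",d'))                       # ['a', 'b,c', 'd']
print(guess_size(["a,b", "a,b,c,d", "x"]))             # 4
print(guess_size(["a,b", "a,b,c,d", "x"], scan_limit=1))  # 2
```

The last call shows the trade-off of a scan limit: scanning only the first row suggests two columns, although a later row would need four.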

Input Ports

Input data table with column containing the cells to split

Output Ports

Output data table with additional columns.

Views

This node has no views
