Cell Splitter

This node splits the content of a selected column into parts using a user-specified delimiter. The output takes one of three forms: one or more new columns appended to the input table, each carrying one part of the original value; a single column of list cells holding the parts; or a single column of set cells holding the parts, with duplicates removed.
If a value contains more delimiters than needed (so that it would yield more parts than there are appended columns), the surplus delimiters are ignored and the last column holds the unsplit remainder of the value.
If a value contains too few delimiters (yielding fewer parts than expected), the remaining new cells in that row are left empty.
Collection cells can differ in size from row to row, depending on the delimiters and the number of parts each value yields. If trimming is enabled, leading and trailing spaces are removed from the content of the new columns.
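The fixed-size behaviour described above can be illustrated with a minimal Python sketch. This is not the node's implementation: the function name split_fixed and its parameters are hypothetical, and None merely stands in for an empty cell.

```python
from typing import List, Optional


def split_fixed(value: str, delimiter: str, size: int,
                trim: bool = True) -> List[Optional[str]]:
    """Split `value` into exactly `size` parts (hypothetical helper)."""
    # Splitting at most size - 1 times leaves surplus delimiters
    # untouched, so the last part keeps the unsplit remainder.
    parts: List[Optional[str]] = list(value.split(delimiter, size - 1))
    if trim:
        parts = [p.strip() for p in parts]
    # Rows with too few delimiters are padded; None stands in for an empty cell.
    return parts + [None] * (size - len(parts))


print(split_fixed("a, b, c, d", ",", 3))  # ['a', 'b', 'c, d']
print(split_fixed("a, b", ",", 3))        # ['a', 'b', None]
```

The first call shows the surplus delimiter staying inside the last part; the second shows a short row being padded.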

Options

Select a column
Select the column whose values are split.
Remove input column
When checked, the selected input column will not be part of the output table.
Enter a delimiter
Specify the delimiter that separates the parts within each value.
Use \ as escape character
If enabled, the backslash (\) can be used to escape characters, such as \t for tabs. You can use the full escape capabilities of Java.
Enter a quotation character
Specify the quotation character if the parts in the value are quoted. (The character used to escape quotes is always the backslash.) If no quotation character is needed, leave this field empty.
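How a quotation character protects delimiters inside a part can be sketched as follows. This is an illustrative tokenizer under stated assumptions, not the node's parser: split_quoted is a hypothetical name, the delimiter is assumed to be a single character, and the backslash is assumed to escape whichever character follows it.

```python
from typing import List


def split_quoted(value: str, delimiter: str = ",", quote: str = '"') -> List[str]:
    """Split on a single-character delimiter, honouring quotes (hypothetical helper)."""
    parts: List[str] = []
    current: List[str] = []
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Assumption: a backslash makes the next character literal.
            current.append(value[i + 1])
            i += 2
            continue
        if quote and ch == quote:
            # Quote characters toggle quoted mode and are not kept.
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


print(split_quoted('"a,b",c'))              # ['a,b', 'c']
print(split_quoted(r'"say \"hi\"",x'))      # ['say "hi"', 'x']
print(split_quoted('"a,b",c', quote=""))    # ['"a', 'b"', 'c']
```

With an empty quote argument, mirroring an empty quotation field, quote characters are treated as ordinary text.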
Remove leading and trailing white space chars (trim)
If checked, leading and trailing whitespace of each part (token) will be removed.
Output format
Select how the split results should be output: as a list collection, as a set collection (duplicates removed), or as separate columns.
  • As new columns: If selected, the output will consist of one or more columns, each containing a split part.
  • As list: If selected, the output will consist of one column containing list collection cells in which the split parts are stored. Duplicates can occur in list cells.
  • As set (remove duplicates): If selected, the output will consist of one column containing set collection cells in which the split parts are stored. Duplicates are removed, so each part occurs at most once in a set cell.
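The difference between the list and set output formats can be shown with a short Python sketch. The helper names as_list and as_set are hypothetical, and Python's built-in list and set only approximate the node's collection cells.

```python
from typing import List, Set


def as_list(value: str, delimiter: str) -> List[str]:
    """Keep every part, including duplicates, in split order."""
    return [part.strip() for part in value.split(delimiter)]


def as_set(value: str, delimiter: str) -> Set[str]:
    """Keep each distinct part once."""
    return set(as_list(value, delimiter))


print(as_list("a; b; a", ";"))         # ['a', 'b', 'a']
print(sorted(as_set("a; b; a", ";")))  # ['a', 'b']
```

The set is sorted before printing only to make the output deterministic; a Python set has no defined order.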
Split input column name for output column names
When outputting as new columns, check this option if the input column name can be split in the same way as the column's content; the resulting parts are then used as the names of the output columns.
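The idea of deriving output column names from the input column name can be sketched like this. The function output_column_names is hypothetical, and the fallback name used for positions the header does not cover is purely an assumption for illustration, not the node's naming scheme.

```python
from typing import List


def output_column_names(column_name: str, delimiter: str, size: int) -> List[str]:
    """Split the header the same way as the values (hypothetical helper)."""
    names = [name.strip() for name in column_name.split(delimiter, size - 1)]
    # Assumed fallback: invent a name for any position the header lacks.
    names += [f"{column_name}_{i}" for i in range(len(names), size)]
    return names


print(output_column_names("city;country", ";", 2))  # ['city', 'country']
print(output_column_names("city;country", ";", 3))  # ['city', 'country', 'city;country_2']
```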
Size determination
Choose whether to specify a fixed number of output columns or to automatically determine the size.
  • Guess size and column types (requires additional data table scan): If this is checked, the node performs an additional scan through the entire data table and computes the number of columns needed to hold all parts of the split. In addition it determines the column type of the new columns.
  • Set array size: Check this and specify the number of columns to append. All created columns will be of type String. (See node description for what happens if the split produces a different number of parts.)
Number of columns
Specify the number of columns to append. All created columns will be of type String.
Scan limit (number of lines to guess on)
Maximum number of rows to scan for guessing the number of output columns.
Maximum rows to scan
Maximum number of rows to scan for guessing the number of output columns and their types.
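Guessing the number of columns and their types with a scan limit could look roughly like the sketch below. It is a simplification under assumptions: guess_layout and _kind are hypothetical names, the type labels int, float and str stand in for the node's column types, and the widening rule (int, then float, then str) is an illustrative choice.

```python
from typing import Iterable, List, Optional, Tuple

_RANK = {"int": 0, "float": 1, "str": 2}


def _kind(token: str) -> str:
    """Return the narrowest type label that can parse the token."""
    for name, cast in (("int", int), ("float", float)):
        try:
            cast(token)
            return name
        except ValueError:
            pass
    return "str"


def guess_layout(values: Iterable[str], delimiter: str,
                 scan_limit: Optional[int] = None) -> Tuple[int, List[str]]:
    """Scan up to `scan_limit` rows; return (column count, per-column type)."""
    kinds: List[str] = []
    for row_index, value in enumerate(values):
        if scan_limit is not None and row_index >= scan_limit:
            break
        for i, token in enumerate(value.split(delimiter)):
            kind = _kind(token.strip())
            if i == len(kinds):
                kinds.append(kind)          # first time this position is seen
            elif _RANK[kind] > _RANK[kinds[i]]:
                kinds[i] = kind             # widen the type if needed
    return len(kinds), kinds


print(guess_layout(["1,2.5,x", "3,4"], ","))             # (3, ['int', 'float', 'str'])
print(guess_layout(["1,2", "a,b,c"], ",", scan_limit=1))  # (2, ['int', 'int'])
```

The second call shows the trade-off of a scan limit: rows beyond the limit are not inspected, so the guessed size and types may not fit them.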
Create empty string cells
If checked, empty string cells are created for missing or short input cells instead of missing value cells.
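The effect of this option can be sketched as a choice of filler value. The helper split_or_fill is hypothetical, and None is used here to represent a missing value cell.

```python
from typing import List, Optional


def split_or_fill(value: Optional[str], delimiter: str, size: int,
                  create_empty_strings: bool = False) -> List[Optional[str]]:
    """Pad short or missing inputs with '' or None (hypothetical helper)."""
    filler = "" if create_empty_strings else None
    # A missing input (None) yields no parts, so every output cell is filler.
    parts: List[Optional[str]] = [] if value is None else list(value.split(delimiter, size - 1))
    return parts + [filler] * (size - len(parts))


print(split_or_fill("a;b", ";", 3))                              # ['a', 'b', None]
print(split_or_fill("a;b", ";", 3, create_empty_strings=True))   # ['a', 'b', '']
print(split_or_fill(None, ";", 2, create_empty_strings=True))    # ['', '']
```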

Input Ports

Input data table with the column containing the cells to split.

Output Ports

Output data table with additional columns.

Views

This node has no views
