This node splits the string content of a selected column into logical groups using regular expressions. A group is identified by a pair of parentheses, and the pattern inside the parentheses is itself a regular expression. The content of each group is appended as an individual column. All appended columns will contain missing values if the input string is not completely matched by the selected regular expression.
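The behavior described above can be approximated in plain Java. The following is only a minimal sketch, not the node's actual implementation: the class name `RegexSplitSketch`, the helper `split`, and the sample pattern are hypothetical, and `null` stands in for a missing value. It assumes that "completely matched" corresponds to `Matcher.matches()`, which requires the whole input to match.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSplitSketch {

    // Returns one entry per capturing group; null stands in for a missing value.
    public static String[] split(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        String[] cells = new String[m.groupCount()];
        if (m.matches()) { // the whole input must match, not just a part of it
            for (int i = 1; i <= m.groupCount(); i++) {
                cells[i - 1] = m.group(i);
            }
        }
        return cells; // all entries stay null when there is no full match
    }

    public static void main(String[] args) {
        String regex = "([a-z]+)=([0-9]+)"; // hypothetical two-group pattern
        System.out.println(Arrays.toString(split(regex, "width=42")));   // [width, 42]
        System.out.println(Arrays.toString(split(regex, "width=42px"))); // [null, null]
    }
}
```

The second call yields only missing values because the trailing "px" is not covered by the pattern, even though a prefix of the input would match.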
A short introduction to groups and capturing is given in the Java API. Some examples are given below:
Patent identifiers such as "US5443036-X21", consisting of
a country code of at most two letters ("US"), a patent
number ("5443036") and possibly an application code
("X21") that is separated by a dash or a space
character, can be grouped by the expression
([A-Za-z]{1,2})([0-9]*)[ \-]*(.*$)
Each of the parenthesized terms corresponds to one of the
aforementioned properties.
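As a rough illustration, the patent expression can be tried out with `java.util.regex`. The class name `PatentIdSplit` and the method `parse` are hypothetical names for this sketch, and returning `null` on a failed match is an assumption that mimics the missing values described earlier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatentIdSplit {

    // The expression from the text; backslashes are doubled for the Java string literal.
    static final Pattern PATENT =
            Pattern.compile("([A-Za-z]{1,2})([0-9]*)[ \\-]*(.*$)");

    // Returns {country code, patent number, application code},
    // or null if the identifier is not matched completely.
    public static String[] parse(String id) {
        Matcher m = PATENT.matcher(id);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] ids = { "US5443036-X21", "US5443036 X21", "US5443036" };
        for (String id : ids) {
            System.out.println(id + " -> " + String.join(" | ", parse(id)));
        }
        // An identifier without a leading letter is not matched at all.
        System.out.println(parse("5443036-X21") == null);
    }
}
```

Note that an identifier without an application code, such as "US5443036", still matches: the third group is then the empty string rather than a missing value.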
The node is particularly useful for parsing the file URL
of a file reader node (the URL is exposed as a flow
variable and then exported to a table using a Variable to
Table node). The format of such URLs is similar to
"file:c:\some\directory\foo.csv". Using the pattern
[A-Za-z]*:(.*[/\\])(([^\.]*)\.(.*$))
generates four groups (counted by their opening
parentheses). The first group identifies the directory
and is denoted by "(.*[/\\])". It consumes all characters
up to and including the final slash or backslash; in the
example this is "c:\some\directory\". The second group
represents the file name and encloses the
third and fourth groups. The third group (denoted by
"([^\.]*)") consumes all characters after the directory
that are not a dot (this is "foo" in the
above example). The pattern then expects a single dot
(which belongs to neither the third nor the fourth group)
and finally the fourth group "(.*$)",
which reads until the end of the string and indicates
the file suffix ("csv"). The groups for the above
example are "c:\some\directory\", "foo.csv", "foo" and "csv".
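The nested groups of the file URL pattern can be checked with a short Java sketch. The class name `FileUrlSplit` and the method `parse` are hypothetical; the pattern and the sample URL are taken from the text, with backslashes doubled as Java string literals require.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileUrlSplit {

    // Regex from the text: [A-Za-z]*:(.*[/\\])(([^\.]*)\.(.*$))
    static final Pattern FILE_URL =
            Pattern.compile("[A-Za-z]*:(.*[/\\\\])(([^\\.]*)\\.(.*$))");

    // Returns groups 1 to 4 in order of their opening parentheses,
    // or null if the URL is not matched completely.
    public static String[] parse(String url) {
        Matcher m = FILE_URL.matcher(url);
        if (!m.matches()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Java source for the URL file:c:\some\directory\foo.csv
        String[] groups = parse("file:c:\\some\\directory\\foo.csv");
        for (int i = 0; i < groups.length; i++) {
            System.out.println("group " + (i + 1) + ": " + groups[i]);
        }
    }
}
```

Group numbers follow the position of the opening parenthesis, which is why the enclosing file name group comes second and the two groups nested inside it come third and fourth.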
To use this node in KNIME, install the extension KNIME Base nodes.