0 ×

Joiner (Labs)

KNIME Base Nodes version 4.3.2.v202103021015 by KNIME AG, Zurich, Switzerland

This node combines two tables similar to a join in a database. It combines each row from the top input port with each row from the bottom input port that has identical values in selected columns. Rows that remain unmatched can also be output.

Options

Joiner settings

Join Columns
Select the columns from the top input ('left' table) and the bottom input ('right' table) that should be used for joining. Each pair of columns defines an equality constraint of the form A = B. For two rows to be joined, the row from the left input table must have the same value in column A as the row from the right input table in column B. Row keys can be compared to row keys or regular columns, in which case the row key will be interpreted as a string value.
Include in output
Controls which rows are included in the join result. The join mode corresponding to the selected settings, e.g., inner join, outer join, or antijoin, is displayed for convenience.
Matching rows: If selected, the joined rows are included in the output. Deactivating this can be used to find only the rows that do not have a join partner in the other table.
Left unmatched rows: Whether to include the rows from the left table for which no row in the right table has the same values in the selected join columns. For example, including only matches and left unmatched rows corresponds to a left outer join in database terms.
Right unmatched rows: Whether to include unmatched rows from the right input table. For example, including only right unmatched rows corresponds to a right antijoin in database terms.
Output options
Controls the format in which the join result is output and whether hiliting is enabled.
Output unmatched rows to separate ports: If selected, the node will produce three output tables instead of one. The top output port contains the combined rows, the middle output port contains the unmatched rows from the left input table (if they are included in the "Include in output" settings), and the bottom port contains the unmatched rows from the right input table.
Merge join columns: If active, the join columns of the right input table are merged into their join partners of the left input table. The merged column is named like the left join column if one of its join partners in the right table has the same name. If the join partners have different names, the merged column is named in the form left column=right column.
For instance, when joining a table with columns A, B, and C as left input table with a table that has columns X, A, and Z using the join predicates A=A, A=X, and C=Z, the resulting output table would have columns A, B, C=Z. Note how the column A in the output table contains the value of the column A in the left table, which is also the value of the column X in the right table, as required by the join conditions A=X.
The value of a merged join column for an unmatched row is taken from whichever row has values. For instance, when outputting an unmatched row from the right table in the above example with values x, a, and z, the resulting row in format A, B, C=Z has values x, ?, z.
When merge join columns is off, the row is instead output as ?, ?, ?, x, a, z.
Hiliting enabled: If selected, hiliting rows in the output will hilite the rows in the left and right input tables that contributed to that row. Equally, when hiliting a row in one of the input tables, all rows that the input row contributed to are hilited.
Disabling this option reduces the memory footprint of the joiner, the disk footprint of the workflow, and may speed up the execution in cases where main memory is scarce.
Row Keys
How to generate the keys of the combined output rows.
Concatenate original row keys with separator: The generated row key concatenates the row keys of the input rows, e.g., a row joining rows with key Row3 and Row17 is assigned the key Row3_Row17.
Assign new row keys sequentially: Combined rows are assigned row keys in the order they are produced, e.g., the first row in the join result is assigned row key Row0, the second row is assigned key Row1, etc.

Column selection

Column Selection (Top Input (left table) and Bottom Input (right table))
Include: Select columns that are included in the combined rows.
Exclude: Select columns that are discarded in the combined rows.
Duplicate column names
The option configures how column names clashes are handled, in case a column name appears both in the left input table and the right input table.
Do not execute: The node will display a warning and won't execute if there are duplicate column names in the left and right input tables.
Append suffix: Append the specified suffix to the duplicate column names in the right input table.

Performance

Output order
Arbitrary output order: The execution time of the join can be improved if the output does not have to be sorted. The produced order can vary between any two executions of the joiner and depends on how much main memory is available during the join operation.
Sort by row offset in left table, then right table: The combined rows in the output are sorted according to the offsets of the contributing rows. Consider a row R=(L1, R1) in the output that combines rows L1 and R1 from the left and right input tables, respectively. R comes before another row S=(L2, R2) in the output if L1 comes before L2 in the left table. If two rows in the output both involve the same row from the left table, they are ordered according to the order of the right contributing row.
If the results are output in a single port, the matching rows are first output, then the unmatched rows from the left table, and finally the unmatched rows from the right table.
This output order is the same as in the previous joiner implementation.
Miscellaneous
Maximum number of open files: Controls the number of temporary files that can be created during the join operation and possibly subsequent sorting operations. More temporary files may increase performance, but the operating system might impose a limit on the maximum number of open files.

Input Ports

Icon
Left input table
Icon
Right input table

Output Ports

Icon
Either all results or the result of the inner join (if the unmatched rows are output in separate ports)
Icon
Unmatched rows from the left input table (top input port). Inactive if "Output unmatched rows to separate ports" is deactivated.
Icon
Unmatched rows from the right input table (bottom input port). Inactive if "Output unmatched rows to separate ports" is deactivated.

Best Friends (Incoming)

Best Friends (Outgoing)

Workflows

Installation

To use this node in KNIME, install KNIME Base nodes from the following update site:

KNIME 4.3

A zipped version of the software site can be downloaded here.

You don't know what to do with this link? Read our NodePit Product and Node Installation Guide that explains you in detail how to install nodes to your KNIME Analytics Platform.

Wait a sec! You want to explore and install nodes even faster? We highly recommend our NodePit for KNIME extension for your KNIME Analytics Platform. Browse NodePit from within KNIME, install nodes with just one click and share your workflows with NodePit Space.

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.