Joiner

This node combines two tables similar to a join in a database. It combines each row from the top input port with each row from the bottom input port that has identical values in selected columns. Rows that remain unmatched can also be output.

Options

Match
Defines the logic for the matching criteria:
  • All of the following: If selected, joins two rows only when all matching criteria are satisfied
  • Any of the following: If selected, joins two rows when at least one of the matching criteria is satisfied
Join columns
Defines the columns from the top input ('left' table) and the bottom input ('right' table) that should be used for joining. For two rows to be joined, the row from the left input table must have the same value in column A as the row from the right input table in column B. RowIDs can be compared to RowIDs or regular columns, in which case the RowID will be interpreted as a string value.
  • Top input ('left' table): Select the column from the top input table that should be used to compare with the column selected for the bottom input.
  • Bottom input ('right' table): Select the column from the bottom input table that should be used to compare with the column selected for the top input.
Compare values in join columns by
Defines how to compare the values in the join columns:
  • Value and type: Two rows match only if their values in the join columns selected have the same value and type, e.g. Number (integer) values will never match Number (long) values because they have two different types.
  • String representation: Use this option if you want the values to be converted to string before comparing them. In this way you compare only the value in the selected join columns.
  • Make integer types compatible: Use this option to ignore type differences between Number (integer) and Number (long) types.
Matching rows
Include rows that match on the selected column pairs.
Left unmatched rows
Include rows from the left input table for which no matching row in the right input table is found.
Right unmatched rows
Include rows from the right input table for which no matching row in the left input table is found.
Top input ('left' table)
Select columns from top input ('left' table) that should be included or excluded in the output table.
Bottom input ('right' table)
Select columns from bottom input ('right' table) that should be included or excluded in the output table.
Merge join columns
If selected, the join columns of the right input table are merged into their join partners of the left input table. The merged column is named like the left join column if one of its join partners in the right table has the same name. If the join partners have different names, the merged column is named in the form "left column=right column".
If there are duplicate column names
Defines what should happen if there are column names included in the output that have the same name:
  • Append custom suffix: Adds the defined custom suffix to the column name of the right table.
  • Do not execute: Prevents the node to be executed if the columns have the same name.
Custom suffix
The suffix to be added to the column name of the right table
Split join result into multiple tables
Output unmatched rows (if selected under "Include in output") at the second and third output port, i.e.
  • top: Matching rows
  • middle: Left unmatched rows
  • bottom: Right unmatched rows
RowIDs
Defines how the RowIDs of the output table are generated:
  • Concatenate with separator: The RowID of the output table will be made of the RowID of the top input ('left' table) and the RowID of the bottom ('right' table) separated by the defined separator.
  • Create new: The RowIDs of the output table will be assigned sequential RowIDs, e.g. Row0, Row1, etc.
  • Retain: If the matching rows have the same RowIDs in both input tables as a matching criteria the output table will keep the input tables RowIDs.
Separator
The separator to be added in between RowIDs of the input tables to generate the RowIDs of the output table.
Row order
Defines the row order for the output table rows.
  • Arbitrary: The order of the ouput table rows is defined based on the currently available memory. Select this to improve the performance of the node since the output does not have to be sorted. Be aware that it can produce different output orders on consecutive executions.
  • Input order: Rows are output in three blocks:
    1. matched rows
    2. unmatched rows from left table
    3. unmatched rows from right table
    Each block is sorted so that rows are sorted based on their position in the left table. In case of rows with the same position in the left table, the sorting is determined by the row position in the right table.
Maximum number of temporary files
Defines the number of temporary files that can be created during the join operation and possibly subsequent sorting operations. Increase the number of temporary files to improve the performance of the node. Be aware that the operating system might impose a limit on the maximum number of open files.
Hiliting enabled
If selected, hiliting rows in the output will hilite the rows in the left and right input tables that contributed to that row. Equally, when hiliting a row in one of the input tables, all rows that the input row contributed to are hilited. Disabling this option reduces the memory footprint of the joiner, the disk footprint of the workflow, and may speed up the execution in cases where main memory is scarce.

Input Ports

Icon
Left input table
Icon
Right input table

Output Ports

Icon
Either all results or the result of the inner join (if the unmatched rows are output in separate ports)
Icon
Unmatched rows from the left input table (top input port). Inactive if "Output unmatched rows to separate ports" is deactivated.
Icon
Unmatched rows from the right input table (bottom input port). Inactive if "Output unmatched rows to separate ports" is deactivated.

Views

This node has no views

Workflows

Links

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.