Network Component Splitter

This node splits a network, given as a list of relations between nodes, into its unconnected components. It expects an input table with two String columns containing the (named) nodes of a network; each row represents a connection between the two nodes. The output is a two-column table listing each node and its cluster ID. All nodes with the same cluster ID are transitively connected to each other and to no node of any other cluster.

Cluster numbering starts at 1 and has no gaps, so the maximum cluster ID equals the total number of unconnected network components. The output table is sorted by ascending cluster ID. Which component receives which ID is unspecified; in particular, cluster 1 need not be the biggest cluster.
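
The behavior described above can be illustrated with a minimal Python sketch. It is not the node's actual implementation, which is not shown here; it assumes a union-find (disjoint-set) approach, and the function name split_components is hypothetical. Clusters are numbered in order of first appearance, which is one arbitrary choice among many, since the order of the clusters is unspecified.

```python
from typing import Dict, Iterable, List, Tuple


def split_components(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Return (node, cluster_id) rows, sorted by ascending cluster ID.

    Hypothetical helper for illustration only; union-find is an assumed
    technique, not necessarily what the node uses internally.
    """
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        # Register unseen nodes as their own root.
        parent.setdefault(node, node)
        root = node
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node directly at the root.
        while parent[node] != root:
            parent[node], node = root, parent[node]
        return root

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    # Number clusters 1, 2, 3, ... in order of first appearance (no gaps).
    cluster_of_root: Dict[str, int] = {}
    rows: List[Tuple[str, int]] = []
    for node in list(parent):
        root = find(node)
        cluster_id = cluster_of_root.setdefault(root, len(cluster_of_root) + 1)
        rows.append((node, cluster_id))

    rows.sort(key=lambda row: row[1])
    return rows


if __name__ == "__main__":
    example = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "F")]
    for node, cluster_id in split_components(example):
        print(node, cluster_id)
```

In this sketch a self-relation such as ("F", "F") yields a cluster containing only that node, and the largest cluster ID in the result equals the number of components.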

While this functionality can also be implemented via the Network To Row node and its 'Split-up unconnected components' option, our implementation is tuned for performance and large networks. Thus, it does not operate on KNIME's network data type but on an edge definition table with String-typed node columns directly.

Example applications of this node:
● In production, new products can be assigned to facilities at minimal footprint complexity by keeping distinct material clusters in distinct entities.
● In logistics, hazardous goods can be analyzed for the ability to ship in one delivery.
● In human resources, an organizational chart analysis can reveal data quality issues with employees whose reporting lines do not end at the CEO.

Options

Node Selection

Select Node1 Column
The input table's String-typed column holding the name of the first node of each edge in the network.
Select Node2 Column
The input table's String-typed column holding the name of the second node of each edge in the network.

Missing Value Handling

Handle Missing Value as Node
If checked, the missing value is treated as a valid node name and will appear as a node in the output table.
If unchecked, missing values are resolved by (1) ignoring edges in which both nodes are missing values and (2) treating an edge between a valid String and a missing value as a self-relation of the valid String node. Hence, if unchecked, missing values will not appear in the output table.
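
The two modes can be sketched as a preprocessing step on the edge list. This is a minimal Python illustration under stated assumptions, not the node's actual code: missing values are represented as None, and the names resolve_missing and missing_as_node are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Edge = Tuple[Optional[str], Optional[str]]


def resolve_missing(edges: Iterable[Edge], missing_as_node: bool) -> List[Edge]:
    """Apply the missing value handling described above (illustrative only)."""
    if missing_as_node:
        # Checked: None stays in the edge list and acts as an ordinary node.
        return list(edges)

    resolved: List[Edge] = []
    for a, b in edges:
        if a is None and b is None:
            continue  # (1) both ends missing: ignore the edge
        if a is None:
            a = b  # (2) one end missing: self-relation of the valid node
        elif b is None:
            b = a
        resolved.append((a, b))
    return resolved


if __name__ == "__main__":
    example = [("A", None), (None, None), (None, "B"), ("C", "D")]
    print(resolve_missing(example, missing_as_node=False))
```

With the option unchecked, the example keeps "A" and "B" as single-node self-relations and drops the row in which both values are missing, so no missing value reaches the output.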

Output Column Names

Nodelist Column Name
Name of the output table's first column, containing node names originating from both columns of the input table.
ClusterID Column Name
Name of the output table's second column, assigning a cluster ID to every node name.

Input Ports

A table that includes at least two String-typed columns. The unique values in the union of these two columns are the node names of the network. Each row represents an (undirected) edge between the two respective nodes.

Output Ports

The output table is a unique list of nodes and their assigned cluster IDs.

Views

This node has no views
