Approximate Phrase Matcher (Labs)

The Approximate Phrase Matcher (Labs) node performs approximate phrase-level similarity matching between two text inputs: a Reference Table and a Comparison Table.
It calculates a keyword-based similarity between each comparison phrase and the reference phrases using subword alignment, detecting overlaps, inclusions, and extended forms. The node supports three configurable algorithms that define the logical direction and context of matching:

Subword – Detects partial overlaps or local matches between phrases by analyzing shared character subwords.
Subset – Determines whether the comparison phrase is contained within the reference phrase.
Superset – Determines whether the comparison phrase contains or extends the reference phrase.

This node is suited to fuzzy matching of sentences, reviews, product names, and entity phrases, and supports downstream filtering, labeling, and duplicate detection.
This node uses exorbyte’s deterministic subword-matching engine to compute fuzzy similarity between multi-word strings.
It performs normalization, subword extraction, and local alignment similar to Levenshtein distance but optimized for phrase structures rather than isolated strings.
By combining subword decomposition with directional matching logic (Subset/Superset), it enables context-aware comparisons useful for entity normalization, phrase clustering, and sentiment pattern detection.
This node may only be used for private and non-commercial purposes. Commercial use requires a valid license from exorbyte GmbH. All rights reserved.
For more information, contact consulting@exorbyte.com.

Options

Select Settings Group

Lets the user navigate between the sections of the configuration options:

Input
Search
Output

Select Column in Reference Input

Select the column of the Reference Input Table to be used as the list of Reference Terms.

Select Columns in Comparison Input

Select the columns to be compared against the Reference Terms.

Add Column with Numeric Matching Value

Appends a column showing the calculated similarity score (character count or percentage).

Add Column with Character Match Sequence

Adds a symbolic alignment string visualizing matching and mismatching characters.
'=' -> Match
'x' -> Mismatch
'+' -> Insertion
'/' or '\' -> transition
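The legend above can be illustrated with a small sketch. This is not exorbyte's engine, whose internals are not described here; it uses Python's standard `difflib` as a stand-in aligner. The function name `alignment_string` and the '-' symbol for characters present only in the reference are assumptions made for this illustration, and the '/' and '\' transition symbols are not modeled.

```python
from difflib import SequenceMatcher


def alignment_string(reference: str, comparison: str) -> str:
    """Build a symbolic alignment string (hypothetical helper, difflib-based)."""
    matcher = SequenceMatcher(None, reference, comparison, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        ref_len, cmp_len = i2 - i1, j2 - j1
        if tag == "equal":
            parts.append("=" * ref_len)           # matching characters
        elif tag == "insert":
            parts.append("+" * cmp_len)           # extra characters in the comparison
        elif tag == "delete":
            parts.append("-" * ref_len)           # assumption: symbol not in the legend
        else:  # "replace"
            common = min(ref_len, cmp_len)
            parts.append("x" * common)            # mismatching characters
            parts.append("+" * (cmp_len - common))
            parts.append("-" * (ref_len - common))
    return "".join(parts)


if __name__ == "__main__":
    print(alignment_string("cat", "cut"))
    print(alignment_string("abc", "abxc"))
```

The actual node may align phrases differently, so its output for the same inputs can differ from this sketch.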

Add Column with Hit Characters Sequence

Appends a column showing which parts of the reference phrase were matched by the comparison.

Add Column with Best Reference Match

Appends the most similar reference phrase for each comparison row, identifying the best match candidate.

Matching Algorithm Selector

Specifies the algorithm used to calculate phrase similarity between reference and comparison inputs.
Each algorithm defines a specific containment relationship and subword-level comparison strategy.
Options:

Subword - Detects overlapping character subsequences (subwords) between phrases. Suitable for partial or flexible overlap detection.
Subset - Checks if the comparison phrase is contained in the reference phrase.
Superset - Checks if the comparison phrase contains the reference phrase.
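The directional logic of the three options can be sketched as follows. This is a minimal illustration under stated assumptions, not the node's algorithm: matching characters are counted with Python's `difflib`, and the choice of denominator per mode (comparison length for Subset, reference length for Superset, longer phrase for Subword) is a hypothetical way to express the containment direction. The names `matched_chars` and `directional_score` are invented for this example.

```python
from difflib import SequenceMatcher


def matched_chars(a: str, b: str) -> int:
    """Total length of the matching blocks difflib finds between a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def directional_score(comparison: str, reference: str, mode: str) -> float:
    """Hypothetical score in [0, 1]: how much of one phrase the other covers."""
    hits = matched_chars(comparison, reference)
    if mode == "subset":       # is the comparison contained in the reference?
        denom = len(comparison)
    elif mode == "superset":   # does the comparison contain the reference?
        denom = len(reference)
    elif mode == "subword":    # symmetric overlap, normalized by the longer phrase
        denom = max(len(comparison), len(reference))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return hits / denom if denom else 0.0


if __name__ == "__main__":
    for mode in ("subword", "subset", "superset"):
        print(mode, directional_score("apple", "apple inc", mode))
```

In this sketch, "apple" scores fully as a Subset of "apple inc" but only partially as a Superset, which mirrors the direction each option describes.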

Case Sensitivity

Determines whether the matching process should treat uppercase and lowercase characters as distinct.
Options:

Case Sensitive - Maintains exact letter casing during comparison.
Case Insensitive - Normalizes all text to lowercase before matching.

Numeric Matching Value

Defines how similarity between phrases is measured numerically.
Options:

Number of Matching Characters - Returns the total number of identical characters between the comparison and reference phrase.
Similarity in Percent - Returns a normalized percentage value (0–100%) representing relative similarity.
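The two numeric options, together with the Case Sensitivity setting, can be illustrated with a short sketch. It relies on Python's `difflib` rather than the node's engine, and the normalization by the longer phrase is an assumption, since the exact denominator the node uses is not stated. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def matching_characters(comparison: str, reference: str,
                        case_sensitive: bool = True) -> int:
    """Count characters that difflib aligns as identical (illustrative only)."""
    if not case_sensitive:
        # Case Insensitive: normalize both phrases to lowercase before matching.
        comparison, reference = comparison.lower(), reference.lower()
    matcher = SequenceMatcher(None, comparison, reference, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def similarity_percent(comparison: str, reference: str,
                       case_sensitive: bool = True) -> float:
    """Matching characters as a percentage of the longer phrase (assumption)."""
    longest = max(len(comparison), len(reference))
    if longest == 0:
        return 0.0
    hits = matching_characters(comparison, reference, case_sensitive)
    return 100.0 * hits / longest


if __name__ == "__main__":
    print(matching_characters("Apple", "apple"))
    print(matching_characters("Apple", "apple", case_sensitive=False))
    print(round(similarity_percent("apple", "apple inc"), 1))
```

A character count is easier to interpret for short, fixed-format terms, while a percentage is comparable across phrases of different lengths.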

Row Filter Condition

Controls which rows are included in the node output based on the match result.
Options:

Output matching rows - Only outputs rows that meet or exceed the similarity threshold.
Output non-matching rows - Only outputs rows that do not meet the threshold.
No Filtering - Outputs all rows with match metadata for analysis.
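The three filter conditions amount to a threshold test on the numeric matching value. The sketch below shows that logic on plain (phrase, score) pairs; the function name `filter_rows` and the condition keys are hypothetical and do not correspond to any KNIME API.

```python
from typing import Iterable, List, Tuple

ScoredRow = Tuple[str, float]  # (comparison phrase, numeric matching value)


def filter_rows(rows: Iterable[ScoredRow], condition: str,
                threshold: float) -> List[ScoredRow]:
    """Apply a row filter condition to already-scored rows (illustrative)."""
    rows = list(rows)
    if condition == "none":            # No Filtering: keep every row
        return rows
    if condition == "matching":        # rows that meet or exceed the threshold
        return [row for row in rows if row[1] >= threshold]
    if condition == "non_matching":    # rows that fall below the threshold
        return [row for row in rows if row[1] < threshold]
    raise ValueError(f"unknown condition: {condition}")


if __name__ == "__main__":
    scored = [("apple inc", 90.0), ("aple", 80.0), ("banana", 10.0)]
    print(filter_rows(scored, "matching", 80.0))
    print(filter_rows(scored, "non_matching", 80.0))
```

Keeping all rows with "No Filtering" is useful when the threshold is still being tuned, because the scores of rejected rows remain visible.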

Matching Value Threshold - Minimal Number of Matches

Sets the filter threshold, expressed in terms of the selected Numeric Matching Value.
This setting appears only when filtering is enabled in Row Filter Condition.
If Number of Matching Characters was chosen, the threshold is a minimum number of matching characters. If Similarity in Percent was chosen, the threshold is a minimum similarity percentage.

Matching Value Threshold - Minimal Matching Percentage

Input Ports

: Mapping
: Contains canonical or reference phrases to match against.
: Contains phrases to be compared with the reference input.

Output Ports

: Comparison rows enriched with numeric match values, alignment sequences, and the best matching reference phrase.

Views

This node has no views

Workflows

S&P500 Company Lookup with Approximate Phrase Matcher (KNIME Hub)

Installation

To use this node in KNIME, install the exorbyte matchmaker toolbox extension from the update site below, following the NodePit Product and Node Installation Guide:

v5.9

A zipped version of the software site can be downloaded here.

Plugin provider: exorbyte GmbH

Plugin version: 1.2.0

On NodePit since: 2026-01-05

Last update: 2026-02-17

Tags: Streamable, Modern UI

KNIME versions: v5.9, v5.8, v5.5
