Phrase Indexer (Labs)

The Phrase Indexer node creates a searchable index from a selected string column that contains multi-word phrases.
Before indexing, each phrase is split into individual tokens based on a user-defined delimiter (default: blank space). Each token is then indexed, enabling efficient approximate string matching over multi-word data.
This node is particularly useful for text fields containing full names, product descriptions, or address lines, where indexing based on words rather than entire strings improves recall and match flexibility.
The generated index can be passed to the downstream Approximate Phrase Index Matcher node, which retrieves similar phrases or partial matches from large datasets.
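
The idea of indexing words rather than whole strings can be illustrated with a small sketch. This is not the node's implementation: the function names `build_token_index` and `find_rows`, the `min_ratio` threshold, and the use of Python's `difflib` as the similarity measure are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def build_token_index(phrases, delimiter=" "):
    """Map each token to the set of row numbers whose phrase contains it."""
    index = defaultdict(set)
    for row, phrase in enumerate(phrases):
        for token in phrase.split(delimiter):
            if token:  # skip empty tokens produced by repeated delimiters
                index[token].add(row)
    return index


def find_rows(index, query_token, min_ratio=0.75):
    """Return the rows that contain a token similar to query_token.

    Similarity here is difflib's ratio, an assumed stand-in for whatever
    measure the matcher node actually uses.
    """
    rows = set()
    for token, token_rows in index.items():
        if SequenceMatcher(None, query_token, token).ratio() >= min_ratio:
            rows |= token_rows
    return rows


phrases = ["John Smith", "Jon Smyth", "Mary Jones"]
index = build_token_index(phrases)
matches = find_rows(index, "Smith")  # rows 0 and 1 ("Smith" and "Smyth")
```

Because the lookup works per token, the query "Smith" finds both rows even though neither full phrase equals the query.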

Options

Select Column to Index
Selects the string column from the input table that contains the phrases to be indexed. Only string-convertible columns are available.
Delimiter
Specifies the character or pattern used to split each phrase into words before indexing.
By default, phrases are split by a blank space (" ").
Example delimiters: comma (,), semicolon (;), or custom token separators.
Choose a delimiter that matches how words are actually separated in the column; otherwise phrases are not split into the intended tokens and match quality suffers.
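
The effect of the delimiter can be seen in a short sketch. The helpers `tokenize` and `tokenize_pattern` are hypothetical, and treating a pattern as a regular expression is an assumption; the node's own pattern syntax is not specified here.

```python
import re


def tokenize(phrase, delimiter=" "):
    """Split on a literal delimiter and drop empty tokens."""
    return [t for t in phrase.split(delimiter) if t]


def tokenize_pattern(phrase, pattern):
    """Split on a regular-expression pattern (assumed syntax) and drop empty tokens."""
    return [t for t in re.split(pattern, phrase) if t]


address = "12 Main Street;Springfield;IL"

by_space = tokenize(address)                      # default blank-space delimiter
by_semicolon = tokenize(address, ";")             # semicolon delimiter
by_pattern = tokenize_pattern(address, r"[ ;]+")  # either space or semicolon
```

With the default blank space, the last token stays glued together as "Street;Springfield;IL", so a search for "Springfield" would have no exact token to match against.
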
Representation of Indexed Strings
Determines how indexed values are represented in the output table of downstream matching nodes.
Options:
  • Original - Displays strings as they appear in the input column.
  • Normalized - Displays transformed versions to improve fuzzy-matching precision.
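
The difference between the two representations can be sketched as follows. The exact transformation behind Normalized is not described here, so the `normalize` function below (lowercasing and collapsing whitespace) is purely an assumption for illustration, as is the `represent` helper.

```python
def normalize(text):
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def represent(value, mode="Original"):
    """Return the string as it would be shown for the chosen representation."""
    if mode == "Original":
        return value
    if mode == "Normalized":
        return normalize(value)
    raise ValueError(f"unknown representation: {mode}")


raw = "  ACME   Corp. "
shown_original = represent(raw, "Original")      # unchanged input string
shown_normalized = represent(raw, "Normalized")  # "acme corp."
```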

Input Ports

Mapping
Aliases
Table containing the text or phrase column to be indexed.

Output Ports

Contains the indexed representation of the tokenized phrases. The object includes metadata about the index (e.g., number of phrases, tokens, unique terms, algorithm parameters) and serves as input for downstream nodes.
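
As a rough illustration of the kind of metadata mentioned above, the counts can be derived from the tokenized phrases. The `IndexSummary` class and `summarize` function are hypothetical names, and the real index object may store these values differently.

```python
from dataclasses import dataclass


@dataclass
class IndexSummary:
    phrases: int       # number of indexed phrases
    tokens: int        # total number of tokens across all phrases
    unique_terms: int  # number of distinct tokens


def summarize(phrases, delimiter=" "):
    """Count phrases, tokens, and distinct tokens for a list of phrases."""
    tokens = [t for p in phrases for t in p.split(delimiter) if t]
    return IndexSummary(len(phrases), len(tokens), len(set(tokens)))


summary = summarize(["red apple", "green apple", "red pear"])
# 3 phrases, 6 tokens, 4 unique terms (red, apple, green, pear)
```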

Views

This node has no views
