Single-Field Indexer (Labs)

The Single-Field Indexer node creates a searchable index from a selected string column of an input data table. By default, the index is based on the TeraEdit object, which enables efficient approximate string matching using edit distance.
The Index Type configuration allows users to choose between different indexing techniques:

  • TeraEdit for single-word edit-distance matching
  • TeraAuto for single-word matching; fast, but less fault-tolerant than TeraEdit

The generated index can be passed to downstream nodes, such as the Approximate Index Matcher, to quickly retrieve similar values within large datasets. Because the indexed values are stored in a structured format, similarity lookups stay fast and scalable even when the text contains spelling variations, typos, or other inconsistencies.
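
To illustrate what edit-distance matching means in this context, the following Python sketch computes the Levenshtein distance and uses it to look up similar values. It is not the TeraEdit implementation, whose internals are not described here. The names edit_distance, build_index, and lookup are hypothetical, and the linear scan only stands in for a real index structure.

```python
from typing import Iterable, List, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with two-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def build_index(values: Iterable[Optional[str]]) -> List[str]:
    """Hypothetical stand-in for an index: unique, non-empty strings, sorted."""
    return sorted({v for v in values if v})


def lookup(index: List[str], query: str, max_distance: int = 1) -> List[Tuple[str, int]]:
    """Return every indexed value within max_distance edits of the query."""
    matches = []
    for value in index:
        distance = edit_distance(query, value)
        if distance <= max_distance:
            matches.append((value, distance))
    return matches


# Assumed example data: a column with a duplicate and a missing value.
index = build_index(["Berlin", "Munich", "Hamburg", "Berlin", None])
result = lookup(index, "Berlim", max_distance=1)
```

In this sketch the misspelled query "Berlim" is one substitution away from "Berlin", so it is returned as a match, while the other values are too far away. A real index avoids comparing the query against every stored value, which is what makes lookups practical on large tables.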

Options

Select Column to Index
Selects the string column from the input table that will be used to build the index. Only string-convertible columns are available.
Index Type
Specifies the indexing technique used to build the searchable index. Different index types optimize matching for different data characteristics:
  • TeraEdit: edit-distance matching for single-word strings.
  • TeraAuto: fast matching for single-word strings, but less fault-tolerant than TeraEdit.
Representation of Indexed Strings
tbd

Input Ports

Mapping
Table containing the canonical string values to match against.

Output Ports

Contains the indexed representation of the selected string column. The object includes metadata about the index (e.g., number of rows indexed, unique terms, algorithm parameters) and serves as an input for downstream nodes.

Views

This node has no views
