
Term Indexing and Matching Overview Examples

<p><strong>Term Fuzzy Matching &amp; Index Management</strong></p><p>This workflow demonstrates how to perform <strong>term index matching, normalization, and index management</strong> using the exorbyte M|BOX nodes in KNIME.<br></p><p>It showcases key steps for enabling <strong>fault-tolerant name and text matching</strong> through configurable indexing, normalization, and license activation.</p><p>🧩 Included Examples</p><ul><li><p><strong>License Setup:</strong> How to request, activate, and manage MatchMaker licenses.</p></li><li><p><strong>Quickstart:</strong> Detecting exact matches and single-edit typos using Levenshtein similarity.</p></li><li><p><strong>Character Normalization:</strong> Improving accuracy with accent and diacritic mapping.</p></li><li><p><strong>Index Writing &amp; Reading:</strong> Persisting and reusing index data across workflows.</p></li><li><p><strong>Name Variant Matching:</strong> Linking variant spellings (e.g., <em>Mathias ↔ Matthias</em>) through indexed fuzzy matching.</p></li></ul><p>✉️ Contact us: <strong>consulting@exorbyte.com</strong><br>🌐 <strong>exorbyte.ai</strong></p>

URL: exorbyte GmbH https://www.exorbyte.ai

🧭 Quickstart: Exact & Single-Edit Typos

This section demonstrates how the Term Index Matcher finds exact matches and small typos using a simple Levenshtein comparison.

  • Levenshtein distance handles insertions, deletions, and substitutions.

  • Perfect for testing single-edit errors or near-identical names.

  • Works best on short, clean strings like first names.
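
The edit operations listed above can be illustrated outside KNIME with a minimal Python sketch. This is not the M|BOX implementation; the function names and the length-normalized similarity score are assumptions chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed scoring: 1.0 for identical strings, lower as edits increase."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("Anna", "Anna"))  # 0: exact match
print(levenshtein("Anna", "Ana"))   # 1: one deletion
print(similarity("Anna", "Ana"))    # 0.75
```

On short strings such as first names, a single edit already shifts the normalized score noticeably, which is why a small distance threshold is usually enough for typo detection.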

🔤 Character Normalization

This example shows how Character Mapper normalization improves match accuracy when dealing with diacritics or special characters.

  • Maps accented letters (Á→A, ö→oe, é→e).

  • Ensures consistent indexing and querying across scripts.

  • Ideal for multilingual or international datasets.

Reference Names:

  • MÁRÍA

  • Jörg

  • François
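
As a rough illustration of what such a mapping does to the reference names above, here is a minimal Python sketch. It is not the Character Mapper's actual rule set: the explicit table and the fallback of stripping combining marks are assumptions, and `normalize` is a hypothetical helper name.

```python
import unicodedata

# Assumed explicit mappings for letters that expand to two characters.
EXPLICIT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def normalize(text: str) -> str:
    """Map accented letters to plain ASCII equivalents."""
    # Apply the explicit multi-letter replacements first.
    replaced = "".join(EXPLICIT_MAP.get(ch, ch) for ch in text)
    # Decompose what is left (e.g. "Á" becomes "A" plus a combining accent)
    # and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for name in ["MÁRÍA", "Jörg", "François"]:
    print(name, "->", normalize(name))
# MÁRÍA -> MARIA
# Jörg -> Joerg
# François -> Francois
```

Applying the same function to both the indexed values and the queries keeps the two sides comparable, so "Jorg" and "Jörg" differ by one edit after normalization rather than by an unrelated character.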

💾 Index Writing

🔐 How to Get Your License

Use the License Requester node to request your exorbyte MatchMaker license, then activate it with the License Activator, before running any toolbox nodes.

  1. Choose Demo (30 days) or Production.

  2. Enter your email (and Customer Token if production).

  3. Execute the node; it sends a secure request to the exorbyte team.

  4. If offline, manually email the request file to knime-node-license@exorbyte.com.

  5. When you receive the .lic file, reopen the node, select Use available license file, run the node, and then run the License Activator.

⚠️ Each KNIME installation or Hub environment needs its own license.

👉 See full workflow guide: How to license exorbyte Extension

🅰 Name Variant Matching

This section demonstrates how the Term Index Matcher can detect name variants and associate them with their canonical forms using Levenshtein similarity.

  • Index the base names using Term Indexer.

  • Match the variants against the indexed base names with the Term Index Matcher using the Levenshtein algorithm.
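
The two steps above can be sketched in plain Python. This is only an approximation of the idea, not how the Term Indexer or Term Index Matcher work internally: the "index" here is a simple list, `match_variants` and `max_distance` are hypothetical names, and the sample names other than Mathias/Matthias are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using a single rolling row."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        diagonal, row[0] = row[0], i
        for j, cb in enumerate(b, 1):
            # row[j] is the cell above, row[j - 1] the cell to the left.
            diagonal, row[j] = row[j], min(
                row[j] + 1, row[j - 1] + 1, diagonal + (ca != cb)
            )
    return row[-1]


def match_variants(base_names, variants, max_distance=1):
    """Map each variant to its closest base name, or None if too far away."""
    matches = {}
    for variant in variants:
        best = min(
            base_names,
            key=lambda base: levenshtein(variant.lower(), base.lower()),
        )
        distance = levenshtein(variant.lower(), best.lower())
        matches[variant] = best if distance <= max_distance else None
    return matches


base_names = ["Matthias", "Katharina", "Stefan"]           # canonical forms
variants = ["Mathias", "Catharina", "Stephan", "Oliver"]   # queries

matches = match_variants(base_names, variants, max_distance=2)
for variant, base in matches.items():
    print(variant, "->", base)
```

With a threshold of 2, Mathias, Catharina, and Stephan resolve to their base names, while Oliver stays unmatched. A linear scan like this is fine for a handful of names; a real index exists precisely to avoid comparing every query against every entry.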

📥 Index Reading

License Requester
License Activator
Comparison (Query table)
CSV Reader
Reference
CSV Reader
Term Indexer
Levenshtein Algorithm
Term Index Matcher
Comparison (Query table)
CSV Reader
Dataset
CSV Reader
Term Indexer
Term Indexer
Ungroup the results
Ungroup
Reference
CSV Reader
Levenshtein Algorithm
Term Index Matcher
Levenshtein Algorithm
Term Index Matcher
Read the index from workflow data area
Index Reader
Character Mapper
Store the index in workflow data area
Index Writer
Comparison (Query table)
CSV Reader
Term Indexer
Levenshtein Algorithm
Term Index Matcher
Base Names
Table Creator
Variants
Table Creator

Nodes

  • CSV Reader (6 ×)
  • Approximate Index Matcher (Labs) (5 ×)
  • Single-Field Indexer (Labs) (4 ×)
  • Table Creator (2 ×)
  • Character Mapper (Labs) (1 ×)
  • Index Reader (Labs) (1 ×)
  • Index Writer (Labs) (1 ×)
  • License Activator (Labs) (1 ×)
  • License Requester (Labs) (1 ×)
  • Ungroup (1 ×)
