Index Reader and Writer – Reusing Indexed Data

<p>This workflow demonstrates how to <strong>store and reload</strong> an Exorbyte MatchMaker index in KNIME using the <strong>Index Writer</strong> and <strong>Index Reader</strong> nodes.<br>It shows how an index created by the <strong>Term Indexer</strong> can be <strong>saved once</strong> and <strong>reused</strong> later for matching, filtering, or search tasks without being rebuilt each time.</p><p>Combining these nodes covers the full indexing lifecycle:</p><ol><li><p><strong>Build</strong> an index from raw data (e.g., customer names, product codes, or IDs).</p></li><li><p><strong>Persist</strong> the index as a binary file (.bin) using the Index Writer node.</p></li><li><p><strong>Reload</strong> the index later, in the same or another workflow, using the Index Reader node.</p></li><li><p><strong>Reuse</strong> the loaded index for fuzzy matching or filtering via the <strong>Term Index Matcher</strong>.</p></li></ol><p>Reusing a saved index avoids the cost of re-indexing large datasets and supports <strong>workflow modularity</strong>, <strong>reproducibility</strong>, and <strong>collaboration</strong> between teams: every workflow that loads the same index file matches against the same indexed data, which keeps fuzzy-matching results consistent across projects.</p><p>⚙️ <strong>Key Highlights</strong></p><ul><li><p>Demonstrates the complete <strong>index lifecycle</strong>: <em>write → read → match</em>.</p></li><li><p>Uses <strong>flow variables</strong> to pass the index path from the writer node to the reader node.</p></li><li><p>Ensures <strong>portability</strong> by writing indexes relative to the workflow data area.</p></li><li><p>Ideal for <strong>large-scale</strong> or <strong>repetitive</strong> fuzzy-matching workflows where rebuilding indexes is costly.</p></li></ul>

URL: exorbyte GmbH https://www.exorbyte.com/en

🧩 Save & Reload Index – Reusing Indexed Data

This workflow demonstrates how to store and reload an Exorbyte MatchMaker index using the Index Writer and Index Reader nodes.
It’s useful when you want to reuse a previously built index across workflows without re-indexing the data every time.

Create the Index

  • Use the Term Indexer node to build an index from your input data (e.g., customer names or product IDs).

  • The node outputs an index object, the searchable structure that fuzzy matching runs against.

Save the Index 📝

  • Connect the index output to the Index Writer node.

  • Configure:

    • Write to: Workflow Data Area (recommended for portability).

    • File: choose a file name (e.g., customer_index.bin).

    • Optionally enable Create missing folders and Overwrite existing file.

  • Execute to save the index.

  • The node produces a flow variable exorbyte.index_writer.<name>.index_file containing the absolute path to the saved file.
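
The flow variable name follows the pattern given above, where <name> depends on the writer's configuration. As a minimal sketch, the lookup of that variable can be written in Python as shown below. A plain dict stands in for the workflow's flow variables, and the helper name find_index_file as well as all sample keys and paths are hypothetical.

```python
import re

# Pattern taken from the text: exorbyte.index_writer.<name>.index_file
INDEX_VAR = re.compile(r"^exorbyte\.index_writer\.(?P<name>.+)\.index_file$")


def find_index_file(flow_variables):
    """Return {writer name: index path} for every matching flow variable."""
    found = {}
    for key, value in flow_variables.items():
        match = INDEX_VAR.match(key)
        if match:
            found[match.group("name")] = value
    return found


# Hypothetical flow variables; the keys and paths are made up for illustration.
flow_variables = {
    "knime.workspace": "/home/user/knime-workspace",
    "exorbyte.index_writer.customers.index_file":
        "/home/user/knime-workspace/Demo/data/customer_index.bin",
}

paths = find_index_file(flow_variables)
```

Keying the result by <name> keeps the lookup usable when a workflow contains more than one Index Writer.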

Load the Index 🚀

  • Add the Index Reader node to the same workflow or to a new one.

  • Configure:

    • Read from: Workflow Data Area.

    • File: use the same path (or inject it via the flow variable from the Writer node).

    • Optionally enable Validate index to check file integrity.

  • Execute to reconstruct the index in memory.

Use the Index 🔗

  • Connect the Index Reader output to the Term Index Matcher node to perform matching.
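
The matcher node in this workflow carries the label Levenshtein. As a conceptual illustration only, edit-distance matching of a query against a list of terms can be sketched in Python as follows. This is not Exorbyte's implementation: a real index avoids the linear scan used here, and the function names, sample terms, and max_distance threshold are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def match(query, terms, max_distance=2):
    """Return (term, distance) pairs within max_distance, closest first."""
    scored = [(term, levenshtein(query.lower(), term.lower())) for term in terms]
    hits = [(term, dist) for term, dist in scored if dist <= max_distance]
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical indexed terms and query.
terms = ["Anna Schmidt", "Jon Smith", "John Smith"]
results = match("Jon Smyth", terms)
```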

⚙️ Steps

💡 Tip

You can share pre-built indexes across workflows or with team members.
Copy the .bin file into the workflow’s data folder and point the Index Reader to it. The index does not need to be rebuilt.
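
Copying the file can also be scripted. The Python sketch below assumes the workflow data area is a folder named data inside the workflow directory; that folder name, the helper name install_index, and the example paths are assumptions, not part of the Exorbyte nodes.

```python
import shutil
from pathlib import Path


def install_index(index_file, workflow_dir, data_folder="data"):
    """Copy a pre-built index into a workflow's data folder.

    Assumes the data area is <workflow_dir>/<data_folder>.
    Returns the path of the copied file.
    """
    source = Path(index_file)
    if not source.is_file():
        raise FileNotFoundError(f"Index file not found: {source}")
    target_dir = Path(workflow_dir) / data_folder
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return Path(shutil.copy2(source, target_dir / source.name))
```

A call such as install_index("shared/customer_index.bin", "knime-workspace/Demo") would place the file at knime-workspace/Demo/data/customer_index.bin, where the Index Reader can read it from the workflow data area.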

Customers
Excel Reader
Indexing Customer Full Names
Term Indexer
Storing the Index in the workflow data area
Index Writer
Activate License
License Activator
Reading the Index from the workflow data area
Index Reader
Levenshtein
Term Index Matcher
Query
Table Creator