
CV Skill Keyword Detection with Term Indexer

<p>This use case demonstrates how exorbyte's <strong>M|Box Term Indexing Nodes</strong> can be combined for <strong>keyword-based CV skill detection</strong>.</p><p>Workflow Overview:</p><ol><li><p>The workflow loads:</p><ul><li><p>A table containing skill keywords (e.g. "PowerBI", "Python", "pandas")</p></li><li><p>PDFs of applicants' CVs, which are parsed into text fields</p></li></ul></li><li><p>Preprocessing:</p><ul><li><p>The CVs' text is split into paragraphs (based on line breaks)</p></li><li><p>English stopwords are removed</p></li></ul></li><li><p>Indexing &amp; Matching:</p><ul><li><p>The <strong>Term Indexer</strong> builds an index from the skill keywords.</p></li><li><p>The <strong>Term Index Matcher</strong> efficiently and fault-tolerantly detects the keywords in the CV sections.</p></li></ul></li><li><p>Postprocessing:</p><ul><li><p>The postprocessing returns a table containing the CV-IDs and the skills that do and do not appear in each document.</p></li></ul></li></ol><p>The <strong>M|Box Term Indexer and Term Index Matcher</strong> nodes enable fast, scalable keyword detection through a deterministic index-based approach, delivering consistent, explainable results without the unpredictability of AI-based matching. This makes them well suited for compliance-sensitive or auditable pipelines.</p>

URL: exorbyte GmbH https://exorbyte.ai/
URL: exorbyte/KNIME https://exorbyte.ai/knime

Import Keywords & Parse CVs

  • Import a table containing skill keywords (e.g. "Python", "SQL", "Apache Spark")

  • Import and parse applicants' CVs into a string field

Preprocessing

  • Filter out all unnecessary columns from the Tika Parser's output

  • Create CV-IDs

  • Split the content based on line breaks

  • Remove English stopwords
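
For readers who want to see the preprocessing logic outside KNIME, the split and stopword steps above can be sketched in plain Python. This is only an illustration: the stopword list, the function names, and the `(cv_id, paragraph)` row layout are assumptions, not the workflow's actual configuration.

```python
# Tiny illustrative stopword list (an assumption; the workflow's real list is not shown).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "to", "for", "i", "have"}

def split_paragraphs(text: str) -> list[str]:
    """Split CV text on line breaks and drop empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def remove_stopwords(paragraph: str) -> str:
    """Drop stopwords (compared case-insensitively); keep all other tokens as-is."""
    return " ".join(t for t in paragraph.split() if t.lower() not in STOPWORDS)

def preprocess(cvs: dict[str, str]) -> list[tuple[str, str]]:
    """Return one (cv_id, cleaned_paragraph) row per non-empty paragraph."""
    rows = []
    for cv_id, text in cvs.items():
        for paragraph in split_paragraphs(text):
            cleaned = remove_stopwords(paragraph)
            if cleaned:
                rows.append((cv_id, cleaned))
    return rows
```

Calling `preprocess({"CV-1": "I have experience with Python and pandas"})` would yield a single row for `CV-1` with the stopwords removed.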

Postprocessing

  • Group the detected keywords by their corresponding CVs

  • Find the keywords that are not included in each CV

  • Return a table with the CV-ID, the included skills and the skills that do not appear.
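
The grouping and set-difference logic of these steps can be sketched in Python as follows. The function name `summarize`, the `(cv_id, skill)` match format, and the output column names are hypothetical and chosen only for illustration; the workflow itself does this with KNIME nodes.

```python
def summarize(matches, all_skills, cv_ids):
    """Group detected skills per CV and list the skills that were not detected.

    matches    -- iterable of (cv_id, skill) pairs, duplicates allowed
    all_skills -- every skill keyword that was indexed
    cv_ids     -- every CV-ID, so CVs without any match still get a row
    """
    found = {cv_id: set() for cv_id in cv_ids}
    for cv_id, skill in matches:
        found.setdefault(cv_id, set()).add(skill)

    skills = set(all_skills)
    return [
        {
            "cv_id": cv_id,
            "included": sorted(detected),
            "missing": sorted(skills - detected),
        }
        for cv_id, detected in found.items()
    ]
```

Passing the full list of CV-IDs separately ensures that a CV with no detected keywords still appears in the result, with every skill listed as missing.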

Request/Activate exorbyte License

Request and register your exorbyte license before running any M|Box nodes.

If you do not have an active license, use the License Requester:

  1. Choose Demo (30 days) or Production.

  2. Enter your email (and Customer Token if production).

  3. Execute the node – it sends a secure request to the exorbyte team.

  4. When you receive the .lic file, reopen the node → Use available license file, and run the node.

Afterwards, or if you already have an active license, run the License Activator.

⚠️ Each KNIME installation or Hub environment needs its own license.

👉 See full exorbyte License Activation Guide

Indexing & Matching

  • Use the Term Indexer to build a fast, fault-tolerant index on the skill keywords.

  • Use the Term Index Matcher to query the index with the CV sections, ignoring leading and trailing characters to detect (partial) keyword matches.
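
To give a feel for what fault-tolerant keyword matching means, here is a deliberately simple Python sketch. It is not exorbyte's algorithm and does not use the M|Box API: it swaps in the standard library's `difflib.SequenceMatcher` as the similarity measure, compares every candidate against every keyword (so it has none of a real index's speed), and treats "ignoring leading and trailing characters" as stripping surrounding punctuation. The function names and the `0.8` threshold are assumptions.

```python
import difflib
import re

def normalize(term: str) -> str:
    """Lowercase and strip leading/trailing non-word characters."""
    return re.sub(r"^\W+|\W+$", "", term.lower())

def build_index(keywords):
    """Map each normalized keyword to its original spelling."""
    return {normalize(k): k for k in keywords}

def match(index, paragraph, threshold=0.8):
    """Return the original keywords that approximately occur in the paragraph."""
    if not index:
        return set()
    tokens = paragraph.split()
    max_words = max(len(key.split()) for key in index)
    hits = set()
    # Compare every 1..max_words token window against every keyword.
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            candidate = normalize(" ".join(tokens[i:i + n]))
            for key, original in index.items():
                ratio = difflib.SequenceMatcher(None, candidate, key).ratio()
                if ratio >= threshold:
                    hits.add(original)
    return hits
```

With `index = build_index(["Python", "SQL", "Apache Spark"])`, the call `match(index, "Skills: (Pyhton), SQL; Apache Spark.")` detects all three keywords despite the transposed letters and surrounding punctuation. Because the same input always produces the same matches, a threshold-based approach like this stays deterministic and easy to audit.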

License Activator
Term Indexer
Postprocessing
Create File/Folder Variables
Term Index Matcher
Preprocessing
Skill keywords
CSV Reader
Tika Parser
Path to String (Variable)
License Requester
