Unique Term Extractor

KNIME Textprocessing Plug-in version 4.3.0.v202011212014 by KNIME AG, Zurich, Switzerland

This node creates a global set of unique terms over all documents. Optionally, the set can be restricted to the top-k terms ranked by frequency. Three frequency measures are available for filtering: the term frequency, the document frequency and the inverse document frequency.

  • Term Frequency (TF): Overall count of a term in all documents.
  • Document Frequency (DF): Number of documents in which a term occurs.
  • Inverse Document Frequency (IDF): The logarithm of the total number of documents divided by the DF.
More information about term frequencies can be found here.
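
The three measures defined above can be sketched in a few lines of Python. This is only an illustration, not the node's implementation: the function name `term_frequencies` and the sample documents are hypothetical, documents are assumed to be already tokenized, and the natural logarithm is an assumption, since the description does not state which base or smoothing the node uses.

```python
import math
from collections import Counter

def term_frequencies(documents):
    """Return {term: (tf, df, idf)} for a list of tokenized documents."""
    tf = Counter()
    df = Counter()
    for tokens in documents:
        tf.update(tokens)        # every occurrence counts toward TF
        df.update(set(tokens))   # each document counts at most once toward DF
    n_docs = len(documents)
    # IDF as described above: log(total documents / DF).
    # Natural log is an assumption, not taken from the node's documentation.
    return {t: (tf[t], df[t], math.log(n_docs / df[t])) for t in tf}

# Hypothetical sample data
docs = [
    ["data", "mining", "data"],
    ["text", "mining"],
    ["data", "plot"],
]
stats = term_frequencies(docs)
```

In this sample, "data" occurs three times in total but in only two documents, so its TF and DF differ. A term that occurs in a single document, such as "text", gets the largest IDF.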

Options

Document column
Select the document column to extract the terms from.
Most frequent terms (k)
Check this option to restrict the output table to the top k most frequent terms.
Filter terms by
If the 'Most frequent terms (k)' option is checked, the terms are sorted by the selected frequency method (TF, DF or IDF). Only the top-k most frequent terms are then added to the data table.
Append index column
If checked, the node appends an index column containing a unique index for each term. This is especially useful for replacing words with numbers while preparing documents for deep learning.
Append frequency columns
If checked, the node appends a term frequency (TF), document frequency (DF) and inverse document frequency (IDF) column.
Number of threads
The number of threads used to process the documents.
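
To illustrate how the 'Most frequent terms (k)' and 'Append index column' options could work together, here is a minimal Python sketch. It is not the node's implementation: the function name `top_k_with_index`, the 1-based index, the alphabetical tie-breaking and the dropping of terms outside the top k are all assumptions made for the example.

```python
def top_k_with_index(freqs, k=None):
    """Rank terms by descending frequency, keep the first k, and number them."""
    # Ties are broken alphabetically (an assumption, to keep the result deterministic).
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    if k is not None:
        ranked = ranked[:k]
    # 1-based indices are an assumption; the node's numbering may differ.
    return {term: i for i, (term, _) in enumerate(ranked, start=1)}

# Hypothetical term frequencies
tf = {"data": 3, "mining": 2, "text": 1}
index = top_k_with_index(tf, k=2)

# Replace words with numbers, skipping terms that were filtered out.
encoded = [index[t] for t in ["data", "text", "mining"] if t in index]
```

Here `index` maps "data" to 1 and "mining" to 2, and `encoded` becomes `[1, 2]` because "text" falls outside the top two terms. A deep learning pipeline might instead map such terms to a reserved "unknown" index; that choice is left open.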

Input Ports

The input table containing the documents.

Output Ports

An output table containing a unique term column and, if the corresponding options are checked, frequency columns and an index column.


Installation

To use this node in KNIME, install KNIME Textprocessing from the following update site:

KNIME 4.3

A zipped version of the software site can be downloaded here.

Not sure what to do with this link? The NodePit Product and Node Installation Guide explains in detail how to install nodes in your KNIME Analytics Platform.
