Topic Extractor (Parallel LDA)

KNIME Textprocessing Plug-in version 4.0.0.v201908091514 by KNIME AG, Zurich, Switzerland

A simple parallel, multi-threaded implementation of LDA, following Newman, Asuncion, Smyth and Welling, "Distributed Algorithms for Topic Models", JMLR (2009), with the SparseLDA sampling scheme and data structure from Yao, Mimno and McCallum, "Efficient Methods for Topic Model Inference on Streaming Document Collections", KDD (2009).

The node uses the topic modeling library from "MALLET: A Machine Learning for Language Toolkit".
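
The partition-and-merge idea behind the parallel implementation can be illustrated with a small sketch. This is not MALLET's code and it performs no sampling: it only shows documents being split across worker threads, each worker collecting its own topic-word counts, and the counts being merged afterwards. The function names (`partition`, `local_counts`, `merge`) and the toy documents are assumptions made for this illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(docs, num_threads):
    """Split the documents round-robin into num_threads shards."""
    return [docs[i::num_threads] for i in range(num_threads)]


def local_counts(shard):
    """Count (topic, word) pairs for one shard of topic-assigned tokens."""
    counts = Counter()
    for doc in shard:
        for word, topic in doc:
            counts[(topic, word)] += 1
    return counts


def merge(counts_per_shard):
    """Sum the per-shard statistics into one global count table."""
    total = Counter()
    for counts in counts_per_shard:
        total.update(counts)
    return total


# Toy corpus (invented): each document is a list of (word, assigned topic) pairs.
docs = [
    [("cat", 0), ("dog", 0)],
    [("stock", 1), ("market", 1)],
    [("cat", 0), ("market", 1)],
]

shards = partition(docs, 2)
with ThreadPoolExecutor(max_workers=2) as pool:
    per_shard = list(pool.map(local_counts, shards))
merged = merge(per_shard)
```

In this sketch the merged table equals the counts computed over the whole corpus in a single pass, which is the property the merge step relies on.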

Options

Document column
The column that contains the pre-processed documents.
Seed
The seed for the random number generator.
No of topics
The number of topics to detect.
No of words per topic
The number of top words to extract per topic.
No of iterations
The number of iterations to perform. More iterations increase the runtime of the algorithm.
Alpha
The alpha parameter defines the Dirichlet prior on the per-document topic distributions, i.e. the prior weight of topic k in a document. The library uses the given alpha for all topics. It is normally set to a value less than 1 (e.g. 0.1) to prefer sparse topic distributions, i.e. few topics per document.
Beta
The beta parameter defines the prior on the per-topic multinomial distribution over words, i.e. the prior weight of word w in a topic. The library uses the given beta for all words. It is normally set to a value much less than 1 (e.g. 0.001) to strongly prefer sparse word distributions, i.e. few words per topic.
No of threads
The number of threads to use. The input document collection is split across the specified number of threads, and the calculated statistics are merged afterwards.
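
Why small alpha and beta values favor sparse distributions can be seen by sampling from a symmetric Dirichlet prior directly. The sketch below is independent of the node and of MALLET. It uses only the Python standard library, and the helper names, dimension, trial count and seed are assumptions chosen for illustration. It draws samples by normalizing Gamma variates and reports the average weight of the largest component.

```python
import random


def sample_symmetric_dirichlet(concentration, dim, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_max_weight(concentration, dim=10, trials=2000, seed=42):
    """Average weight of the largest component over many samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(sample_symmetric_dirichlet(concentration, dim, rng))
    return total / trials


# A concentration below 1 puts most of the mass on a few components (sparse);
# a concentration well above 1 spreads the mass almost evenly.
sparse = mean_max_weight(0.1)
flat = mean_max_weight(10.0)
print(f"concentration 0.1: mean largest weight = {sparse:.2f}")
print(f"concentration 10:  mean largest weight = {flat:.2f}")
```

With a concentration of 0.1 a single component tends to dominate each sample, whereas a concentration of 10 yields samples close to uniform. The same reasoning applies to alpha (topics per document) and beta (words per topic).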

Input Ports

Data table with the document collection to analyze. Each row contains one document.

Output Ports

The document collection with topic assignments and, for each document, the probability of belonging to each topic.
The topic models with the terms and their weights per topic.
A table with statistics for each iteration.
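
The first two outputs can be read as follows, shown here as a sketch on invented data. The row layout (topic id, term, weight), the topic names and the probability list are assumptions for illustration and do not reproduce the node's actual column names.

```python
from collections import defaultdict


def top_terms(rows, k):
    """Return the k highest-weighted terms per topic from (topic, term, weight) rows."""
    by_topic = defaultdict(list)
    for topic, term, weight in rows:
        by_topic[topic].append((weight, term))
    return {
        topic: [term for _, term in sorted(pairs, key=lambda p: -p[0])[:k]]
        for topic, pairs in by_topic.items()
    }


def most_likely_topic(probabilities):
    """Return the index of the topic with the highest probability for one document."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Invented term weights in a hypothetical (topic, term, weight) layout.
rows = [
    ("topic_0", "cat", 12.0),
    ("topic_0", "dog", 9.0),
    ("topic_0", "fish", 2.0),
    ("topic_1", "market", 15.0),
    ("topic_1", "stock", 11.0),
]

terms = top_terms(rows, 2)
assigned = most_likely_topic([0.15, 0.80, 0.05])
```

Here `top_terms` mirrors the "No of words per topic" option, and `most_likely_topic` shows how a single topic assignment can be derived from a document's topic probabilities.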

Installation

To use this node in KNIME, install the KNIME Textprocessing Plug-in from the following update site:

KNIME 4.0