
Thesis 1

Text mining

Db preparation

1. Data preparation for topic models. Preprocessing, n-grams, and the exclusion of reviews with a small number of terms can be adjusted as needed.

2. Find the optimal k. Other methods can be implemented in KNIME (https://hub.knime.com/angusveitch/spaces/Public/latest/TopicKR~HRMp6v9Ip_ODMIob). Other topic model algorithms that can be used in R or Python include structural topic models (STM) and correlated topic models (CTM).

3. Obtain the topic solution. Users can test more than one topic solution and choose among them based on interpretability.
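
The workflow asks the user to read the elbow off a plot when choosing k. As a rough cross-check, the same idea can be sketched in plain Python: pick the k whose point lies farthest from the straight line joining the first and last points of the curve. The function name `elbow_k` and the perplexity values below are illustrative assumptions, not output from this workflow, and the sketch assumes the k values are given in ascending order.

```python
def elbow_k(ks, scores):
    """Return the k farthest from the chord joining the first and last points.

    ks     -- candidate numbers of topics, in ascending order
    scores -- one fit score per k (for example perplexity)
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs")
    k_lo, k_hi = ks[0], ks[-1]
    s_lo, s_hi = min(scores), max(scores)
    if k_hi == k_lo or s_hi == s_lo:
        raise ValueError("k values or scores do not vary")

    # Rescale both axes to [0, 1] so the distance does not depend on units.
    xs = [(k - k_lo) / (k_hi - k_lo) for k in ks]
    ys = [(s - s_lo) / (s_hi - s_lo) for s in scores]

    # Perpendicular distance of every point to the chord.
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    norm = (dx * dx + dy * dy) ** 0.5
    dists = [abs(dy * (x - xs[0]) - dx * (y - ys[0])) / norm
             for x, y in zip(xs, ys)]
    return ks[max(range(len(ks)), key=dists.__getitem__)]


# Made-up perplexity values for k = 2..8, for illustration only.
candidate_ks = [2, 3, 4, 5, 6, 7, 8]
perplexity = [1200, 950, 800, 780, 770, 765, 762]
best_k = elbow_k(candidate_ks, perplexity)
```

A heuristic like this only narrows the range; the final choice among nearby solutions still rests on interpretability.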
IEEE Xplore
Excel Reader
Scopus
Excel Reader
Row Filter
Row Filter
Column Filter
Column Filter
Column Appender
Column Filter
Step 3: Select a topic solution (example: k = 4)
Topic Extractor (Parallel LDA)
In a range of topics, identify the elbow range in the image
CHI-Square
Math Formula
Joiner
Row Filter
Column Renamer
Excel Reader
Concatenate
Duplicate Row Filter
Column Filter
3. Token filter
String Replacer (Dictionary)
Excel Reader
In a range of topics, identify the elbow range in the image
Perplexity index
Summary words per topic
GroupBy
Papers per topic
GroupBy
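
The "Papers per topic" count can be sketched outside KNIME as well. The snippet below assumes a hypothetical table of per-document topic weights (one list of weights per paper), assigns each paper to its highest-weight topic, and counts papers per topic. The function name and the example weights are illustrative assumptions.

```python
from collections import Counter


def papers_per_topic(doc_topic_weights):
    """Count documents per dominant topic.

    doc_topic_weights -- list of per-document lists of topic weights;
                         the topic index is the position in the inner list.
    """
    dominant = [max(range(len(w)), key=w.__getitem__)
                for w in doc_topic_weights]
    return Counter(dominant)


# Three made-up papers over four topics (k = 4).
weights = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.6, 0.2, 0.1, 0.1],
]
counts = papers_per_topic(weights)
```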
1. Doc Creation
Column Resorter
2. Preprocessing
Column Renamer
Excel Writer
Column Renamer
Concatenate
Column Resorter
Excel Reader
bi-grams (tri-grams can be added as well)
4. N-grams
Column Appender
Duplicate Row Filter
replace with topic names
Table Creator
Python Script
5. Filter Reviews with Less than 10 words
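
The preparation steps (preprocessing, token filtering, n-grams, and dropping short documents) can be approximated in plain Python for readers who want to reproduce them in a Python Script node or elsewhere. This is a minimal sketch under stated assumptions: the stop-word list is a tiny placeholder, the function names are hypothetical, and the length check counts single-word tokens before bi-grams are added, which may differ from how the workflow counts.

```python
import re

# Placeholder stop-word list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words and very short tokens."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS and len(t) > 2]


def add_bigrams(tokens):
    """Append adjacent-token bi-grams, joined with an underscore."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def prepare(docs, min_terms=10):
    """Tokenize each document, skip those below min_terms, then add bi-grams."""
    prepared = []
    for doc in docs:
        tokens = tokenize(doc)
        if len(tokens) < min_terms:
            continue  # too short to contribute reliably to the topic model
        prepared.append(add_bigrams(tokens))
    return prepared


docs = [
    "Topic models for text mining",
    "Latent Dirichlet allocation extracts latent topics from large collections "
    "of scientific abstracts retrieved from Scopus and IEEE",
]
corpus = prepare(docs)
```

Tri-grams could be added in the same way by zipping three shifted copies of the token list.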
