Topic_Modeling.knar

Topic Models from Reviews

This workflow addresses the problem of extracting and modeling topics from reviews.

Block 1 performs the data preparation on review texts. Block 2 optimizes the parameters for the LDA algorithm. Block 3 applies the LDA algorithm with optimized parameters and displays the LDA topic probabilities along with the average number of stars by topic. Block 4 estimates the importance of topics via linear regression (KNIME) and polynomial regression (R).
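The topic-importance step in Block 4 can be pictured as an ordinary least squares regression of star ratings on per-review topic probabilities. The following is a minimal pure-Python sketch of that idea, not the workflow's KNIME or R implementation: the data are synthetic, the function names are hypothetical, and dropping the last topic as a baseline is an assumption made here because topic probabilities sum to 1 and would otherwise be collinear with the intercept. The polynomial regression done in R is not shown.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_topic_regression(topic_probs, stars):
    """Return [intercept, coef_topic_0, ...]; the last topic is the baseline."""
    rows = [[1.0] + list(p[:-1]) for p in topic_probs]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, stars)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic per-review topic probabilities for three topics (illustrative only).
topic_probs = [
    (0.7, 0.2, 0.1),
    (0.1, 0.8, 0.1),
    (0.3, 0.3, 0.4),
    (0.5, 0.1, 0.4),
    (0.2, 0.5, 0.3),
]
# Synthetic ratings generated from a known linear rule so the fit can be checked.
stars = [3.0 + 2.0 * p[0] - 1.5 * p[1] for p in topic_probs]

coefs = fit_topic_regression(topic_probs, stars)
print([round(c, 3) for c in coefs])
```

Under this setup, each coefficient reads as the expected change in star rating when probability mass moves from the baseline topic to the topic in question.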

If you use this workflow, please cite:
F. Villaroel Ordenes & R. Silipo, “Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications”, Journal of Business Research 137(1):393-410, DOI: 10.1016/j.jbusres.2021.08.036.

Analysis of customer experience feedback with topic models

The study of customer experience management (CXM) with big data analytics (BDA) has been one of the most relevant marketing analytics topics in recent years. This workflow shows how managers can identify the service aspects with the greatest impact on customers' overall evaluation (star rating). It also shows how to integrate R for statistical analysis within KNIME.

Block 1 - Data Preparation for Topic Models: This block performs preprocessing, extraction of n-grams, and exclusion of reviews with a small number of terms. It can be adjusted as desired.

Block 2 - Find optimal k for topic models: This block finds the optimal k (number of topics) for the LDA topic modeling algorithm. Other methods for topic extraction and modeling can be implemented in KNIME (see https://hub.knime.com/angusveitch/spaces/Public/latest/TopicKR~HRMp6v9Ip_ODMIob). Other topic modeling algorithms that can be used in R or Python are structural topic models (STM) and correlated topic models (CTM).

Block 3 - Obtain topic solution: Users can test more than one solution for topic extraction and choose the best one based on interpretability. The "topic analysis component" needs to be manually edited to rename topics if changes are made at any earlier stage of the process.

Workflow labels: Step 3: Select a topic; Fit Comparison; Summary words per topic; bi-grams; Best number of topics: 2, 4, 6 used for speed; Time the loop; Convert to Minutes.

Workflow nodes: Topic Extractor (Parallel LDA), Preprocessing, Doc Creation, Line Plot, GroupBy, N-grams, Step 2: Optimal k, Excel Reader, Timer Info, Math Formula.
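The preparation steps named for Block 1 (preprocessing, n-gram extraction, and dropping very short reviews) can be sketched in a few lines of plain Python. This is an illustration under assumptions, not a reproduction of the KNIME nodes: the stop-word list, the `min_terms` threshold, the underscore-joined bi-gram format, and all function names are hypothetical.

```python
import re

# Illustrative stop-word list; the workflow's actual list is not given in the text.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "it", "to", "of", "in", "very"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def add_bigrams(tokens):
    """Append bi-grams built from neighbouring tokens (after stop-word removal)."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def prepare(reviews, min_terms=3):
    """Return one term list per review, excluding reviews with too few terms."""
    docs = []
    for text in reviews:
        tokens = tokenize(text)
        if len(tokens) >= min_terms:  # assumed threshold for "small number of terms"
            docs.append(add_bigrams(tokens))
    return docs


reviews = [
    "The room was clean and the staff was friendly.",
    "Great!",
    "Breakfast was cold and the coffee was weak.",
]
docs = prepare(reviews)
for doc in docs:
    print(doc)
```

Because bi-grams are formed after stop-word removal here, a pair such as `clean_staff` can join words that were not adjacent in the original sentence; forming bi-grams before filtering is an equally reasonable choice.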
