Topic Models from reviews LAPTOP_victor_edits.knar

Topic Models from Reviews

This workflow addresses the problem of extracting and modeling topics from reviews.

Block 1 performs the data preparation on review texts. Block 2 optimizes the parameters for the LDA algorithm. Block 3 applies the LDA algorithm with optimized parameters and displays the LDA topic probabilities along with the average number of stars by topic. Block 4 estimates the importance of topics via linear regression (KNIME) and polynomial regression (R).
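As a rough illustration of the kind of data preparation Block 1 performs, the sketch below tokenizes reviews, drops short reviews, and appends bi-grams. It is not the workflow's Preprocessing component: the tokenizer, the tiny stop-word list, the order of the steps, and the function names are all assumptions made for this example. Only the 10-word threshold and the use of bi-grams come from the workflow's node labels.

```python
import re

MIN_WORDS = 10  # mirrors the workflow label "Filter Reviews with Less than 10 words"

# Tiny illustrative stop-word list (an assumption, not the workflow's list).
STOP_WORDS = {"the", "a", "an", "and", "or", "is", "it", "to", "of", "in", "for", "this", "i"}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigrams(tokens):
    """Join each pair of adjacent tokens with an underscore."""
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def prepare(reviews):
    """Return one term list per review that has at least MIN_WORDS tokens.

    Bi-grams are built after stop-word removal; that ordering is a choice
    made for this sketch.
    """
    docs = []
    for text in reviews:
        tokens = tokenize(text)
        if len(tokens) < MIN_WORDS:
            continue  # exclude reviews with too few words
        terms = [t for t in tokens if t not in STOP_WORDS]
        docs.append(terms + bigrams(terms))
    return docs
```

Each returned list could then be passed to any LDA implementation as one document.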

If you use this workflow, please cite:
F. Villarroel Ordenes & R. Silipo, “Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications”, Journal of Business Research 137(1):393-410, DOI: 10.1016/j.jbusres.2021.08.036.

Block 1 - Data Preparation for Topic Models: This block performs preprocessing, extraction of n-grams, and exclusion of reviews with a small number of terms. It can be adjusted as desired.

Block 2 - Find optimal k for topic models: This block finds the optimal k for the LDA topic modeling algorithm. Other methods for topic extraction and modeling can be implemented in KNIME (see https://hub.knime.com/angusveitch/spaces/Public/latest/TopicKR~HRMp6v9Ip_ODMIob). Other topic modeling algorithms that can be used in R or Python are structural topic models (STM) and correlated topic models (CTM).

Block 3 - Obtain topic solution: Users can test more than one solution for topic extraction and choose the best one based on interpretability. The "Topic Analysis" component needs to be manually edited to rename topics if changes are made at any earlier stage of the process.

Block 4 - Analysis to inspect the impact of topics on customer star rating: Analysis can be improved by including topic sentiment, interaction terms, and different modeling alternatives (e.g., ordinal logit regression in R).

Analysis of customer experience feedback with topic models

The study of customer experience management (CXM) with big data analytics (BDA) has been one of the most relevant marketing analytics topics in recent years. The present workflow shows how managers can identify the service aspects with a greater impact on the customer's overall evaluation (star rating).
The workflow also shows how to integrate R for statistical analysis within KNIME.

Workflow annotations:
- Filter reviews with less than 10 words. N-grams: bi-grams (tri-grams can be added as well).
- Step 2: Optimal k in [14,16]. Narrow down the search on a smaller range from step 1. Identify elbow point (15).
- Step 3: Select a topic. Summary words per topic. Topic numbers with new names (topic 0 excluded because it was too general).
- Average number of stars by topic per laptop. Topic comparison per laptop.
- Ordinal logit regression (MASS package) and fit comparison. DV: Star Rating. IVs: topic probabilities (baseline: Topic 0). Control: Laptop.

Nodes shown in the workflow: Excel Reader (Laptops Reviews), Preprocessing, N-grams, Doc Creation, Topic Extractor (Parallel LDA), Line Plot, Topic Analysis, GroupBy, Column Rename, Number To String, Linear Regression Learner, Table to R, Excel Writer.
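The elbow point in Step 2 appears to be read off a line plot. As a hedged illustration of how such a choice could be automated, the sketch below picks the k with the largest discrete second difference of a lower-is-better fit score. This heuristic is swapped in for the visual inspection and is not the workflow's method. The function name and the score values are made up for the example, and the candidate k values are assumed to be equally spaced.

```python
def elbow_by_second_difference(ks, scores):
    """Return the interior k where the score curve bends most sharply.

    Assumes equally spaced k values and a score where lower is better
    (for example perplexity), so the elbow is the point with the largest
    positive second difference.
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs of equal length")
    best_k, best_curvature = None, float("-inf")
    for i in range(1, len(ks) - 1):
        curvature = scores[i - 1] - 2 * scores[i] + scores[i + 1]
        if curvature > best_curvature:
            best_k, best_curvature = ks[i], curvature
    return best_k


# Illustrative, made-up scores; they are NOT results from the workflow.
candidate_ks = [13, 14, 15, 16, 17]
made_up_scores = [520.0, 480.0, 430.0, 425.0, 422.0]
best = elbow_by_second_difference(candidate_ks, made_up_scores)
```

With these made-up numbers the curve flattens after k = 15, so that is the value the heuristic selects. On real scores the result should still be checked against the plot and against topic interpretability.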
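The regression setup in Block 4 (DV: star rating, IVs: topic probabilities, baseline: Topic 0) relies on dropping one topic. Topic probabilities sum to 1 for each review, so keeping all of them alongside an intercept would make the predictors perfectly collinear. The sketch below shows that idea with plain ordinary least squares solved through the normal equations. It is neither KNIME's Linear Regression Learner nor the ordinal logit model from R's MASS package, and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_star_rating(topic_probs, stars, baseline=0):
    """Fit stars ~ intercept + topic probabilities, omitting the baseline topic.

    Returns [intercept, coefficient for each non-baseline topic in order].
    Each coefficient is read relative to the omitted baseline topic.
    """
    rows = [[1.0] + [p for j, p in enumerate(doc) if j != baseline]
            for doc in topic_probs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, stars)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

A control such as the laptop model could be added as extra dummy columns in the same way, again leaving one category out as the reference level.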
