Topic_modeling_IB_literature_analysis

1. Data input
4302 IB papers from Scopus, published in 2007-2021.

2. Text transformation
Cleaned data: 4302 papers.
Document list: save as Excel file.

2.1 Keyword extraction
Extract and evaluate keywords for creating an individual stoplist in the Preprocessing step.
Keywords and their frequency; bi-grams (tri-grams can be added as well).

3. Perplexity calculation - with two steps based on Ordenes & Silipo (2021, p. 405)
Step 1: Assess model fit (perplexity) in a wide range of topics (2 to 80). Identify elbow range in image.
Step 2: Narrow down the search on a smaller range from step 1. Identify elbow point (17).

4. Topic modeling
Execute LDA and save the output data as an Excel file.
17 topics.
Topic keywords: add to Excel file.

Topic weight transformation
Transformation of topic weight output in order to calculate mean topic probability for the trend analysis and aggregate the topic weight for journal distribution analysis.
Mean topic probability: save as Excel file.
Paper equivalent measure: save as Excel file.

Node labels:
- Excel Reader
- Keywords extraction
- Document filter - remove documents with less than 10 words
- Preprocessing
- Interactive Table (local)
- N-grams
- Document creation
- Excel Writer
- Visualize Perplexity
- Step 1: Optimal k in [2,80]
- Perplexity Visualization
- Step 2: Optimal k in [14,17]
- Topic Extractor (Parallel LDA)
- Create mean topic probability
- Aggregation of topic weight
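The topic weight transformation can be illustrated outside KNIME with a minimal Python sketch. It is not the workflow's own implementation: the row layout (year, journal, list of topic weights), the sample values, and the function names `mean_topic_probability` and `aggregate_topic_weight` are all assumptions made for illustration. It also assumes that the mean is taken per publication year for the trend analysis and that the aggregation is a per-journal sum.

```python
from collections import defaultdict

# Hypothetical document-topic weights: (year, journal, weights per topic).
# The values are invented for illustration and use 3 topics instead of 17.
docs = [
    (2007, "Journal A", [0.7, 0.2, 0.1]),
    (2007, "Journal B", [0.1, 0.8, 0.1]),
    (2008, "Journal A", [0.3, 0.3, 0.4]),
]

def mean_topic_probability(rows):
    """Average each topic's weight over all documents of the same year."""
    sums = {}
    counts = defaultdict(int)
    for year, _journal, weights in rows:
        acc = sums.setdefault(year, [0.0] * len(weights))
        for i, w in enumerate(weights):
            acc[i] += w
        counts[year] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def aggregate_topic_weight(rows):
    """Sum each topic's weight over all documents of the same journal."""
    totals = {}
    for _year, journal, weights in rows:
        acc = totals.setdefault(journal, [0.0] * len(weights))
        for i, w in enumerate(weights):
            acc[i] += w
    return totals

mean_by_year = mean_topic_probability(docs)      # input for a trend analysis
weight_by_journal = aggregate_topic_weight(docs)  # input for a journal distribution
```

Because each document's topic weights sum to 1, the per-journal sums add up to the number of papers in that journal, which is one plausible reading of a paper equivalent measure.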
