05_Word_Embedding_Distance

Distances on Word Embeddings

This workflow uses word embeddings instead of one-hot encoding, trained with a Word2Vec Learner node. The hidden layer size is set to 10, so the resulting embedding has a very low dimensionality.

The output of the Word2Vec Learner node is a model. The Vocabulary Extractor node extracts the words in the model vocabulary and provides each word's embedding as a collection. A Split Collection Column node separates the collection items into individual columns, and the distances between the word embedding vectors are then calculated.

The workflow ends in two branches. In "Scatter Plot of n selected words", n selected words are visualized on a scatter plot to show how close semantically related words are across different embedding coordinates. In "Extract all distances for a selected word", the String Configuration node lets you enter one selected word and retrieve the distances from that word to all other words. Smaller distances should correspond to words that are closer in context or meaning.

Workflow nodes, with their annotations where present:

- Reading Data: read articles from PubMed (Pubmed_Articles.csv)
- Pre-processing: cleaning, stemming, tag filtering
- Word2Vec Learner: 10 hidden units
- Vocabulary Extractor: extract word embedding for vocabulary
- Split Collection Column
- Distance Matrix Calculate: distance on word embeddings
- Distance Matrix Pair Extractor: distance on pairs
- Scatter Plot
- Table Creator: n selected words
- Reference Row Filter: only the n selected words
- Color Manager: by word
- Column Rename: key - word
- Row Filter: on key
- Row Filter: on word
- Concatenate: extract distances from selected word to other words
- String Configuration: selected word
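
The distance steps can be sketched outside KNIME as follows. This is a minimal illustration, not the workflow's implementation: the vectors are invented toy values shortened to three dimensions (the workflow uses 10), the function names are hypothetical, and Euclidean distance is an assumption, since the description does not say which metric the distance nodes are configured with.

```python
import math
from itertools import combinations

# Toy embeddings: values are made up for illustration only, and shortened
# to 3 dimensions for readability (the workflow produces 10).
EMBEDDINGS = {
    "cancer": [0.9, 0.1, 0.0],
    "tumor": [0.8, 0.2, 0.1],
    "diet": [0.0, 0.9, 0.7],
    "nutrition": [0.1, 0.8, 0.8],
}


def pairwise_distances(embeddings):
    """Return {(word_a, word_b): distance} for every unordered word pair.

    Roughly what a distance matrix followed by a pair extraction yields.
    Euclidean distance is an assumption, not taken from the workflow.
    """
    words = sorted(embeddings)
    return {
        (a, b): math.dist(embeddings[a], embeddings[b])
        for a, b in combinations(words, 2)
    }


def distances_from(selected_word, embeddings):
    """Return (other_word, distance) pairs for one word, nearest first."""
    origin = embeddings[selected_word]
    rows = [
        (word, math.dist(origin, vector))
        for word, vector in embeddings.items()
        if word != selected_word
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for pair, distance in pairwise_distances(EMBEDDINGS).items():
        print(pair, round(distance, 3))
    for word, distance in distances_from("cancer", EMBEDDINGS):
        print(word, round(distance, 3))
```

Sorting the result of `distances_from` by distance puts the nearest words first, which matches the expectation that smaller distances indicate words closer in context or meaning.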
