
Text_mining_in_life_sciences_literature

Text mining techniques in life sciences literature

The workflow searches for life sciences information in the literature indexed in European PMC.
Different nodes are used to obtain the titles and abstracts of the publications, their authors, and the journals that published the literature, offering analyses as tables or plots.
Automatic clustering of topics in abstracts is used to organize data.
Additionally, an example shows how to cross the results with external data.
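As an illustration of crossing publication titles with an external list, the following is a minimal Python sketch, not the workflow's actual implementation. It assumes the matching is a case-insensitive whole-word comparison of every title against every drug name; the function name `find_drugs_in_titles`, the titles, and the short drug list are all hypothetical placeholders.

```python
import re


def find_drugs_in_titles(titles, drug_names):
    """Return (title, drug) pairs where the drug name occurs as a whole word in the title."""
    matches = []
    for title in titles:
        lowered = title.lower()  # normalize case before comparing
        for drug in drug_names:
            # \b keeps a drug name from matching inside a longer word
            pattern = r"\b" + re.escape(drug.lower()) + r"\b"
            if re.search(pattern, lowered):
                matches.append((title, drug))
    return matches


# Hypothetical sample data, for illustration only
titles = [
    "Aspirin use and cardiovascular outcomes",
    "A KNIME workflow for metformin response prediction",
    "Text mining of biomedical abstracts",
]
drug_names = ["Aspirin", "Metformin", "Ibuprofen"]

matches = find_drugs_in_titles(titles, drug_names)
for title, drug in matches:
    print(f"{drug}: {title}")
```

Comparing every title with every drug name is simple but scales with the product of both list sizes, which is acceptable for a bibliography of a few hundred or thousand records.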

The workflow covers the following tasks: detecting FDA drugs mentioned in the titles of the publications; detecting the journals that published on the topic and counting the number of publications in each journal; detecting and plotting the number of publications per year; detecting the authors who signed the publications and counting the number of publications by each one; automatically clustering topics using the abstracts as the corpus; and building a bag of words from the terms used in the abstracts.

The European PubMed Central Advanced Search node is used to define the search. As an example, the search retrieves a bibliography that mentions KNIME. In the external-data example, the results are crossed with the list of drugs approved by the U.S. Food and Drug Administration (FDA).

To make the workflow easier to follow, different filters are applied in different paths using the Column Filter node. However, to save computing time, some of them could have been unified.
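The counting tasks described above (publications per year, per journal, and per author) can be sketched outside KNIME with Python's `collections.Counter`. This is only an approximation of what a value-counting step does; the record layout (`year`, `journal`, `authors`) and the sample values are hypothetical and not taken from the workflow.

```python
from collections import Counter

# Hypothetical records, one per publication, for illustration only
records = [
    {"year": 2021, "journal": "Journal A", "authors": ["Smith J", "Lee K"]},
    {"year": 2022, "journal": "Journal B", "authors": ["Smith J"]},
    {"year": 2022, "journal": "Journal A", "authors": ["Garcia M", "Lee K"]},
]

# One counter per attribute of interest
per_year = Counter(r["year"] for r in records)
per_journal = Counter(r["journal"] for r in records)
# A publication with several authors counts once for each of them
per_author = Counter(a for r in records for a in r["authors"])

# Years in chronological order, ready for plotting
for year in sorted(per_year):
    print(year, per_year[year])

# Journals and authors sorted by number of publications, highest first
print(per_journal.most_common())
print(per_author.most_common())
```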
Labels in the workflow: Extraction of items with information of interest; Years; Publication/year; Terms and topics; Journals; Define the search; Abstracts and publication year; FDA Drugs; Check matches; Drugs in title; Titles; Example: select 2022 abstracts; Publications/author; Author String.

Nodes used in the workflow: European PubMed Central Advanced Search, XPath, Column Filter, Row Filter, Value Counter, GroupBy, Sorter, RowID, Missing Value, Table Creator, Case Converter, Cross Joiner, String Manipulation, String Replacer, Table Indexer, Index Query, Bag Of Words Creator, Topic Extractor (Parallel LDA), Tag Cloud, Color Manager, and Scatter Plot, together with the Abstract pre-processing #1, Abstract pre-processing #2, and Detection of authors steps.
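The bag-of-words step can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not a reproduction of the workflow's pre-processing: the tokenizer, the tiny stop-word list, the minimum word length, and the sample abstracts are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, drop stop words and very short words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS and len(w) > 2]


def bag_of_words(abstracts):
    """Return one term-frequency Counter per abstract."""
    return [Counter(tokenize(a)) for a in abstracts]


# Hypothetical abstracts, for illustration only
abstracts = [
    "KNIME is used for text mining of the literature.",
    "Text mining with KNIME supports drug discovery.",
]

bags = bag_of_words(abstracts)

# Term frequencies over the whole corpus, e.g. as input for a tag cloud
corpus_counts = sum(bags, Counter())
print(corpus_counts.most_common(3))
```

The per-document counters are the kind of input a topic model such as LDA works from, while the corpus-wide counts are what a tag cloud would display.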
