
05. Time Series

Time Series

"Time Series" exercise for the advanced Life Science User Training:
- Extract granularities from a timestamp
- Aggregate by time granularities
- Calculate a moving average
- Calculate a moving aggregation

Literature Search on PubMed
The goal of this exercise is to get all the publications associated with smallpox from the PubMed database and analyze how the number of publications has changed over time.

Step 1
The PubMed Document Extractor component loads all publications associated with smallpox. The Document Data Extractor node extracts the title, abstract, author, and publication date for each publication. The Column Filter node removes the document and query columns from the data table.

Step 2
Use the Extract Date&Time Fields node to extract the year into a separate column.

Step 3
Remove missing integer values using the Missing Value node. Use the GroupBy node to group all publications per year and count the number of titles.

Step 4
Use the Moving Average node to calculate the average of the title counts using the Center Gaussian method with a window of 9.

Step 5
Use the Moving Aggregation node to calculate the maximum of the title counts using a central window of size 9.

Activity I: Conversion and Filtering
Convert publication dates to Date&Time format and filter for publications from 1970-2000.

Step 1
Read the publications from smallpox.csv using the File Reader node.

Step 2
Convert the publication date from string format to Date&Time format using the String to Date&Time node.

Step 3
Filter for all publications from 1970-2000 using the Date&Time-based Row Filter node.

Activity II: Analyse Time Series
Analyze the number of publications related to smallpox over the last years.

Nodes used in the workflow: PubMed Document Extractor, Document Data Extractor, Column Filter, CSV Reader, String to Date&Time, Date&Time-based Row Filter, Extract Date&Time Fields, Missing Value, GroupBy, Moving Average, Moving Aggregation, Line Plot
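For readers who want to check the logic outside KNIME, the steps above can be sketched in plain Python. This is only an illustration, not how the KNIME nodes are implemented: all function names are hypothetical, the `yyyy-MM-dd` date format and the sample dates are made up, the Gaussian width (`sigma = window / 4`) is an assumption, and returning `None` where the centered window does not fit is one possible edge convention that may differ from the Moving Average and Moving Aggregation nodes.

```python
from collections import Counter
from datetime import datetime
import math


def parse_dates(strings, fmt="%Y-%m-%d"):
    """Like String to Date&Time: parse strings; unparseable values become None."""
    parsed = []
    for s in strings:
        try:
            parsed.append(datetime.strptime(s, fmt).date())
        except (TypeError, ValueError):
            parsed.append(None)  # treated as a missing value
    return parsed


def filter_years(dates, first=1970, last=2000):
    """Like Date&Time-based Row Filter: keep dates with first <= year <= last."""
    return [d for d in dates if d is not None and first <= d.year <= last]


def count_per_year(dates):
    """Like Extract Date&Time Fields + GroupBy: (year, count) pairs sorted by year."""
    return sorted(Counter(d.year for d in dates).items())


def centered_windows(values, window):
    """Yield the centered window around each position, or None near the edges."""
    half = window // 2
    for i in range(len(values)):
        if i - half < 0 or i + half >= len(values):
            yield None  # assumed edge handling: no value where the window does not fit
        else:
            yield values[i - half : i + half + 1]


def gaussian_moving_average(values, window=9, sigma=None):
    """Centered moving average with Gaussian weights (sigma is an assumption)."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered window")
    half = window // 2
    sigma = sigma if sigma is not None else window / 4
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [
        None if w is None else sum(x * wt for x, wt in zip(w, weights)) / total
        for w in centered_windows(values, window)
    ]


def moving_max(values, window=9):
    """Centered moving maximum over the same window layout."""
    return [None if w is None else max(w) for w in centered_windows(values, window)]


# Made-up sample input, only to show the call sequence.
raw = ["1969-05-01", "1972-03-10", "1972-11-02", "not a date", "1980-01-15", "2001-07-04"]
per_year = count_per_year(filter_years(parse_dates(raw)))
print(per_year)  # [(1972, 2), (1980, 1)]
```

With a real export, the counts from `count_per_year` would be passed to `gaussian_moving_average` and `moving_max` with `window=9`, mirroring Steps 4 and 5. Note that, like GroupBy, this sketch does not insert rows for years without publications, so the window spans nine rows rather than nine calendar years.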
