
01_Load_Clean_and_Explore

Loading and Exploring Data - Exercise (Solution)

This workflow accesses, preprocesses, and visualizes time series (energy consumption) data by
- converting time values from String to Date&Time
- aligning the time series and filling gaps by linear interpolation
- showing the hourly, daily, and monthly total values in a line plot
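
For readers who want to see the same preparation steps outside of KNIME, the following is a minimal pandas sketch, not the workflow's actual implementation. The sample values and the column names `row_id` and `cluster_26` are hypothetical; the only detail taken from the exercise is the timestamp pattern "yyyy-MM-dd_HH", which corresponds to `%Y-%m-%d_%H` in Python's strftime notation.

```python
import pandas as pd

# Hypothetical sample: hourly readings with the 02:00 row missing.
raw = pd.DataFrame({
    "row_id": ["2010-01-01_00", "2010-01-01_01", "2010-01-01_03", "2010-01-01_04"],
    "cluster_26": [10.0, 12.0, 18.0, 20.0],
})

# Convert the string timestamps (pattern yyyy-MM-dd_HH) to datetime objects.
raw["timestamp"] = pd.to_datetime(raw["row_id"], format="%Y-%m-%d_%H")
series = raw.set_index("timestamp")["cluster_26"]

# Align to a complete hourly grid; hours absent from the data become NaN.
full_index = pd.date_range(series.index.min(), series.index.max(),
                           freq=pd.Timedelta(hours=1))
aligned = series.reindex(full_index)

# Fill the introduced gaps by linear interpolation between neighbours.
filled = aligned.interpolate(method="linear")
print(filled)
```

In this sketch the reindexing step plays the role of the timestamp alignment and the interpolation step plays the role of the missing value handling.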

Time Series Analysis
01. Loading and Exploring Data

Summary:
In this exercise we will load the data file for cleaning, filtering, aggregating, and for some early visualizations.

Instructions:
1) Execute the CSV Reader node to load in the Energy Usage Data.
2) Use a String to Date&Time node to convert the Row ID column to the correct format. The digits in the string pattern are converted correctly if you write "yyyy-MM-dd_HH" in the date format field, or press the "Guess data type and format" button.
3) Use a Column Filter node to remove all columns except the Row ID and Cluster 26, which is the series we will analyze.
4) Use the Timestamp Alignment component to check for missing timestamps in the data.
5) Connect a Missing Value node next to replace the missing values discovered in the previous step. Try the linear interpolation setting.
6) Use separate Aggregation Granularity components to aggregate the time series into hourly, daily, and monthly series.
7) Use Line Plot nodes to visualize the outputs. Do you see any patterns?
8) Open the 01_Additional_Visualizations workflow in the Supplementary Workflows folder and inspect the season plot, confidence bounds, and lag plot of the time series.

How to configure the Line Plot and Line Plot (Plotly) nodes?
- Select the x-axis column, in our case the "aggregatedTimestamp" column, in the dropdown menu
- Include the y-axis column, in our case the "Sum(cluster_26)" column, in the Include/Exclude framework
- Open the General Plot Options tab and write the view title and axis labels in the corresponding fields

Workflow sections:
- Data Loading
- Data Preparation
- Line Plot by hour: notice the daily and weekly seasonality
- Line Plot by day: notice the weekly and yearly seasonality
- Line Plot by month: notice the (start of the) yearly seasonality

Nodes in the workflow: CSV Reader, String to Date&Time, Column Filter, Timestamp Alignment, Missing Value, Aggregation Granularity (3x), Line Plot (Plotly) (3x)

Node labels: Energy usage data; convert date/time into Date&Time objects; Introduce missing hours; Hour, Sum; Day, Sum; Month, Sum
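
The aggregation into hourly, daily, and monthly totals (instruction 6) can be approximated in pandas with `resample`. This is only a sketch under stated assumptions: the series below is synthetic (a constant 1.0 per hour over January and February 2010), and the name `cluster_26` is borrowed from the exercise purely for illustration.

```python
import pandas as pd

# Synthetic hourly series: one unit of consumption per hour for two months.
index = pd.date_range("2010-01-01 00:00", "2010-02-28 23:00",
                      freq=pd.Timedelta(hours=1))
hourly = pd.Series(1.0, index=index, name="cluster_26")

# Sum per hour, per day, and per month ("MS" labels each month by its first day).
hourly_sum = hourly.resample(pd.Timedelta(hours=1)).sum()
daily_sum = hourly.resample("D").sum()
monthly_sum = hourly.resample("MS").sum()

print(daily_sum.head())
print(monthly_sum)
```

Each resulting series has a datetime index and a summed value column, which matches the x-axis and y-axis roles that "aggregatedTimestamp" and "Sum(cluster_26)" play in the line plots.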
