
2018-2020_Bike_Restocking_c8y_v4.1

Lean Restocking Alert System: Combining Cumulocity (to manage IoT Devices, in this case bicycles) with KNIME (to visualize and analyze the data and make predictions)

This workflow implements an alarm system for bicycle restocking at Washington bike stations. Capital Bikeshare offers a download of its bike usage data dating as far back as 2010, as well as a live REST API. In this workflow, we use data from 2018 and 2019 as training data and evaluate the learned model on data from the first three months of 2020.

The task is to predict one of three classes, that is, whether a bike station needs bikes removed, bikes added, or no action. Predicting three classes is easier than predicting a precise number, and it allows classification methods to be used. We only use the data provided by Capital Bikeshare (with some assumptions about the initial status of the stations at the end of 2017). The primary purpose of this workflow is to show how KNIME and Cumulocity can be used in concert. We therefore do not put much effort into optimizing the machine learning model, and we do not use any external data. A natural extension of this workflow would be, for example, to join weather data to the device data in order to improve the prediction quality.
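To make the three-class framing concrete, here is a minimal sketch of how a station's state could be mapped to one of the classes. It is not taken from the workflow: the function name `restock_class`, the use of total station capacity as the denominator, and the thresholds `low` and `high` are all assumptions chosen for illustration.

```python
from typing import Literal

Action = Literal["add bikes", "remove bikes", "no action"]


def restock_class(bikes: int, capacity: int,
                  low: float = 0.1, high: float = 0.9) -> Action:
    """Map a station's fill level to one of the three classes.

    `low` and `high` are hypothetical thresholds, not values used
    by the workflow itself.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    fill = bikes / capacity  # share of the station that is occupied
    if fill <= low:
        return "add bikes"      # station is (almost) empty
    if fill >= high:
        return "remove bikes"   # station is (almost) full
    return "no action"
```

With these assumed thresholds, a station holding 1 bike out of 20 would fall into "add bikes", and a station holding 10 out of 20 into "no action".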

Pre-processing
- For each station, read event data from Cumulocity. Note that there are 13 stations without any events (as of 03/2020).
- Create time series data (ratio of number of available bikes to number of available docks per station for the last 10 hours).
- Infer the target variable: "Will it be necessary to add or remove bikes from the given station in the next hour?"

Train a default Random Forest Machine Learning Model

Read station data
Read station info as of May 2020 and join with device info for each station from Cumulocity. Also guess initial station 'load status'.

Some helpful visualizations
Cross-correlation Map (10th & E St NW); line plot for station Calvert & Biltmore St NW.

Evaluation on an independent Test Set (January through March 2020)

Node annotations: retrieve info about all known devices; filter for bike stations; evaluate on hold-out data; 90-10 random; learn with default settings; distribution of target variable without sampling; remove 'boring' events (instead of equal size sampling); local copy of pre-processed data; distribution of target variable after 'smart' sampling; learn with default settings once for each station; 97.65% correct; simplified evaluation (assuming independence); total sum per station; sink or source?; are stations sources or sinks?; for each station; local copy of station info; create corresponding alarms in Cumulocity; check for alarms; filter for restock alarms; log progress; Accuracy: 84.3%; load pre-trained model; column names; line colors; cross-correlation.

Nodes used: Row Filter, Line Plot, Table Creator, Color Manager, Linear Correlation, Table Row to Variable, Loop End, Cumulocity Device Retriever, Random Forest Predictor, Partitioning, Random Forest Learner, GroupBy, Rule-based Row Filter, Table Writer, Group Loop Start, Java Edit Variable (simple), Sorter, Chunk Loop Start, Cumulocity Events Retriever, Table Reader, Domain Calculator, Cumulocity Alarms Creator, Cumulocity Alarms Retriever, Java Snippet, Scorer, Model Writer, Model Reader, Cumulocity Connector. Named sub-workflows: Add Station Info and Guess Initial Load Status; create time series data; add target variable: event in 1 hour; Check for 'true' errors; filter events.
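The time series and target variable steps of the pre-processing can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: it assumes one bike-to-dock ratio and one class label per hour for a single station, and the function name `make_training_rows` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_training_rows(
    ratios: Sequence[float],
    labels: Sequence[str],
    n_lags: int = 10,
) -> List[Tuple[List[float], str]]:
    """Pair the last `n_lags` hourly ratios with the class of the next hour.

    `ratios[i]` and `labels[i]` are assumed to describe the same hour i
    of one station, in chronological order.
    """
    if len(ratios) != len(labels):
        raise ValueError("ratios and labels must have the same length")
    rows: List[Tuple[List[float], str]] = []
    # t is the most recent observed hour; t + 1 is the hour to predict.
    for t in range(n_lags - 1, len(ratios) - 1):
        features = list(ratios[t - n_lags + 1 : t + 1])
        rows.append((features, labels[t + 1]))
    return rows
```

Each resulting row holds ten lagged ratios as features and the class of the following hour as the target, which is the shape a classifier such as a random forest expects. Rows would be built per station and then concatenated.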
