04_Cross-Platform_Ensemble_Model

Will they blend? An Ensemble model from R, Python, and KNIME models

The challenge is to blend together models from different analytics platforms - i.e. Python, R, and KNIME - to create an ensemble model. The data is the “airline data set” (http://stat-computing.org/dataexpo/2009/the-data.html), enriched with additional external data such as cities, daily weather (https://www.ncdc.noaa.gov/cdo-web/datasets/), US holidays, geo-coordinates, and airplane maintenance. DepDelay is used as the target variable. The three models are an R SVM, a Python Logistic Regression, and a KNIME Decision Tree. Will they blend in a single ensemble model? ... and yes! They blend.

Blog post available at http://www.knime.org/blog/KNIMEAnalyticsPlatform-meets-R-and-Python

Workflow annotations: Table Reader (Airline Dataset and other blended data); Row Sampling (only ca. 100K data rows); Row Splitter (split between 2007 and 2008: 2007 for training, 2008 for testing); normalization; Binning / Discretization; Three Cross-Platform Models; Prediction Fusion; ROC Curve.
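The Prediction Fusion step combines the outputs of the three models into one ensemble prediction. The workflow description does not say which fusion rule is used, so the following is only a minimal sketch that assumes a simple mean of the positive-class probabilities. The function name `fuse_predictions`, the 0.5 threshold, and the probability values are all hypothetical and serve only as illustration.

```python
from statistics import mean


def fuse_predictions(prob_sets, threshold=0.5):
    """Fuse positive-class probabilities from several models by averaging.

    prob_sets: one list of probabilities per model, all of equal length,
               where position i in every list refers to the same data row.
    Returns the fused probabilities and the thresholded 0/1 labels.
    NOTE: mean fusion is an assumption; the workflow may use another rule.
    """
    if not prob_sets:
        raise ValueError("at least one model output is required")
    n_rows = len(prob_sets[0])
    if any(len(p) != n_rows for p in prob_sets):
        raise ValueError("all models must score the same rows")

    # zip(*prob_sets) groups the per-model scores row by row
    fused = [mean(row) for row in zip(*prob_sets)]
    labels = [int(p >= threshold) for p in fused]
    return fused, labels


# Hypothetical scores for three rows (not taken from the airline data)
r_svm = [0.9, 0.2, 0.6]
py_logreg = [0.7, 0.4, 0.3]
knime_tree = [0.8, 0.0, 0.3]

fused, labels = fuse_predictions([r_svm, py_logreg, knime_tree])
print(fused, labels)
```

Averaging is only one option; taking the maximum or median across models would fit the same function shape by swapping `mean` for another aggregate.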
