
01_​Analytics

Workflow

Analytics - Model Selection to Predict Flight Departure Delays
data science, machine learning, model selection, data preparation, ETL, airline data set, flight delay
Model Selection to Predict Flight Departure Delays

This workflow trains a number of data analytics models and automatically selects the best model to predict departure delays from a selected airport. The data is the airline dataset, downloadable from http://stat-computing.org/dataexpo/2009/the-data.html. A departure delay is a delay > 15 min. The default selected airport is ORD.

This workflow shows data reading, data blending, ETL, guided analytics, dimensionality reduction, advanced data mining models, and model selection.

Model Factory
- Predictive Model for Each Airline

Model Selection & Comparison
- ROC Curve
- A/B Test using a Cross Validation
- Lift Chart

Advanced ETL Functionality & Machine Learning-based Pre-processing
- Outlier Detection
- Dimensionality Reduction
- Feature Generation
- Missing Values
- Discretization
- Normalization
- Automatic Dimensionality Reduction (SVD, PCA)
- Machine Learning for Feature Selection

Model Training
- Bag of Models
- Decision Tree & Random Forest
- Neural Network & Deep Learning
- Gradient Boosted Trees
- Logistic Regression (in KNIME)
- Logistic Regression (in R)
- Your own Ensemble Model
- Random Forest (with H2O)

Warning! The dataset used here is just a subset of the original dataset. Therefore, final model performances will differ from those reported in the videos https://youtu.be/IEAsUTN8q68 and https://youtu.be/rvTHhgCKQiw.

The full datasets can be found under the following links:
- airline dataset: http://stat-computing.org/dataexpo/2009/the-data.html
- calendar and weather information: https://developers.google.com/google-apps/calendar/ and https://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/global-historical-climatology-network-ghcn

Note. Some nodes require R and Python to be installed.
- How to install Python: https://www.knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20
- R in KNIME video: https://youtu.be/Dfm4-RuABmY

Workflow labels: Read & Blend Data, Write the best model, Model Selection, Model Factory, Table Writer, Bag of Models, Advanced ETL & ML-based Pre-processing, Read blended data
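The labeling rule and the selection step described above can be sketched outside KNIME in a few lines of plain Python. This is only an illustration, not the workflow's actual implementation: the column names `Origin` and `DepDelay`, the helper names, and the choice of ROC AUC as the single selection criterion are assumptions made for the example. The 15-minute threshold and the ORD default come from the description.

```python
from typing import Dict, List, Optional, Sequence, Tuple

DELAY_THRESHOLD_MIN = 15  # from the description: a delay is > 15 min
DEFAULT_AIRPORT = "ORD"   # from the description: default selected airport


def label_delays(flights: Sequence[dict], airport: str = DEFAULT_AIRPORT) -> List[dict]:
    """Keep flights departing from `airport` and attach a binary `Delayed` label.

    Assumes (hypothetically) that each record has `Origin` and `DepDelay` keys;
    records with a missing delay are skipped.
    """
    labeled = []
    for flight in flights:
        delay: Optional[float] = flight["DepDelay"]
        if flight["Origin"] != airport or delay is None:
            continue
        labeled.append({**flight, "Delayed": int(delay > DELAY_THRESHOLD_MIN)})
    return labeled


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best_model(
    labels: Sequence[int], candidate_scores: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the (name, AUC) pair of the candidate with the highest AUC.

    `candidate_scores` maps a model name to its predicted delay scores
    on the same held-out rows as `labels`.
    """
    results = {name: auc(labels, s) for name, s in candidate_scores.items()}
    best = max(results, key=results.get)
    return best, results[best]
```

In use, `label_delays` would be applied to the blended flight records first, each trained model would score the same held-out rows, and `select_best_model` would then pick the candidate to write out. The workflow itself also compares models with lift charts and cross validation, which this sketch leaves out.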

Download

Get this workflow from the following link: Download

Nodes

01_​Analytics consists of the following 549 node(s):

Plugins

01_​Analytics contains nodes provided by the following 18 plugin(s):