
Group2_1_Training_Evaluation_and_Optimization

Group 2 Training, Evaluation and Optimization

Tasks for Group 2 in the KNIME Data Science Learnathon:
- Train a Decision Tree on the training set, and apply the model to the test set
- Evaluate the performance of the Decision Tree model
- Train a Logistic Regression model on the training set, and apply the model to the test set
- Evaluate the performance of the Logistic Regression model
- Optimize the tree depth of a Random Forest model, and train and apply a Random Forest model using the optimal parameter value
- Evaluate the performance of the Random Forest model
- Compare the performances of the different models using scoring metrics for a classification model and an ROC Curve
- Write the best performing model to a file
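The comparison step above relies on scoring metrics for a classification model and on an ROC curve. As a rough illustration of what those numbers mean, here is a minimal pure-Python sketch. It is not how KNIME computes them; the function names, the 0.5 threshold, and the labels and scores (1 = delayed, 0 = on time) are all made up for the example.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def scoring_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 derived from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive example has the higher score (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and predicted probabilities of a delay.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.35, 0.6, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = scoring_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Computing the same metrics for each model on the same test set gives a like-for-like basis for choosing the best performer. The AUC has the advantage that it does not depend on the chosen threshold.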

Challenge: Model Training, Evaluation and Optimization

Goal: Train a number of data analytics models to predict departure delays at a selected airport (ORD).

Datasets:
1. AirlineDataset.table
2. GHCN-Daily_source.xls contains daily weather information such as precipitation, snowfall, snow depth, temperature, wind speed and wind direction, measured at Chicago O'Hare International Airport. (The explanation of the columns is available in the GHCN_daily_readme file in the “data” folder.)

Suggested Steps: Group 2. Model Training to Predict Departure Delays
- training set
- test set
- Bag of Models
- See instructions inside the metanode. Open by double-click OR right-click -> Metanode -> Open
- Execute this metanode first!
- First ML Model - Decision Tree
- Second ML Model - Linear Regression
- Third ML Model - Gradient Boosted Tree with Parameter Optimization
- Exporting the Best Performing Model
- Data Access, Cleaning, and Partitioning
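The parameter optimization step can be pictured as a simple loop: try each candidate value (for example, a tree depth), score the resulting model, and keep the best value. The sketch below assumes a brute-force search. `optimize_parameter` and `toy_validation_accuracy` are hypothetical names, and the scoring function is a made-up curve standing in for "train a model with this depth and measure its accuracy". It does not reproduce the workflow's actual nodes or results.

```python
from typing import Callable, Iterable, Optional, Tuple


def optimize_parameter(
    candidates: Iterable[int],
    evaluate: Callable[[int], float],
) -> Tuple[int, float]:
    """Brute-force search: return (best_value, best_score).

    The first candidate reaching the highest score wins ties.
    """
    best_value: Optional[int] = None
    best_score = float("-inf")
    for value in candidates:
        score = evaluate(value)
        if score > best_score:
            best_value, best_score = value, score
    if best_value is None:
        raise ValueError("no candidate values supplied")
    return best_value, best_score


def toy_validation_accuracy(depth: int) -> float:
    """Stand-in for training and scoring a model (NOT a real model).

    A made-up curve that peaks at depth 6: shallower trees underfit,
    deeper trees overfit.
    """
    return 0.80 - 0.01 * abs(depth - 6)


# Search tree depths 2..10 and keep the best one.
best_depth, best_accuracy = optimize_parameter(range(2, 11), toy_validation_accuracy)
```

In practice `evaluate` would train on the training set and score on held-out data that is not the final test set, so that the chosen parameter value is not tuned to the data used for the final comparison.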
