DSC723 - Flight Delays (Hanif)

Phase 2 - Data Understanding
Phase 3 - Data Preparation
Phase 1 - Business Understanding
  • Domain: US commercial aviation operations (2008 data).

  • Business problem: Arrival delays cause high operating costs, passenger dissatisfaction and penalties. Airlines and airports need to predict at-risk flights early.

  • Analytical question: Can we predict whether a flight will arrive significantly late (ArrDelay ≥ 15 min), and what are the main drivers of the delay?

  • Target (label): a new column DelayClass = Late if ArrDelay ≥ 15, otherwise OnTime.

  • Task type: Binary Classification.

  • Success criteria: high F1-score and ROC AUC on the test set, plus actionable insight into the main delay drivers.

  • Data leakage: DepDelay and the delay-cause columns are removed because their values are only known once the flight has departed or landed, so keeping them would leak the outcome into the features.
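The target rule and the leakage removal described above can be sketched in plain Python. This is only an illustration, not the KNIME Rule Engine or Column Filter configuration: the sample rows are invented, the helper names `delay_class` and `prepare_row` are hypothetical, and `LEAKAGE_COLUMNS` covers only the delay-related subset of the removed columns.

```python
# Sketch only: column names follow the 2008 flight data; the sample rows are invented.
LATE_THRESHOLD_MIN = 15

# Delay-related columns whose values are not known before the flight departs or lands.
LEAKAGE_COLUMNS = {
    "DepDelay", "CarrierDelay", "WeatherDelay", "NASDelay",
    "SecurityDelay", "LateAircraftDelay",
}


def delay_class(arr_delay):
    """Return 'Late' when ArrDelay >= 15 minutes, otherwise 'OnTime'."""
    return "Late" if arr_delay >= LATE_THRESHOLD_MIN else "OnTime"


def prepare_row(row):
    """Add DelayClass, then drop ArrDelay and the leakage columns."""
    features = {
        name: value
        for name, value in row.items()
        if name not in LEAKAGE_COLUMNS and name != "ArrDelay"
    }
    features["DelayClass"] = delay_class(row["ArrDelay"])
    return features


sample_rows = [
    {"Month": 1, "UniqueCarrier": "WN", "DepDelay": 8, "ArrDelay": 14, "WeatherDelay": 0},
    {"Month": 7, "UniqueCarrier": "AA", "DepDelay": 30, "ArrDelay": 15, "WeatherDelay": 10},
]
prepared = [prepare_row(r) for r in sample_rows]
for r in prepared:
    print(r)
```

ArrDelay itself is dropped after labelling, since the label is derived directly from it.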

Phase 4 - Modeling
Phase 5 - Evaluation
Decision Tree
Random Forest
Gradient Boosted Tree
Logistic Regression
Naive Bayes
Phase 6 - Visualisation
Check Imbalanced Data
Naive Bayes
Decision Tree
Random Forest
Gradient Boosted Tree
Logistic Regression
Data
CSV Reader
Linear Correlation
Decision Tree Predictor
Remove leakage: Column0, Year, DepTime, ArrTime, FlightNum, TailNum, ActualElapsedTime, AirTime, ArrDelay, DepDelay, TaxiIn, TaxiOut, Cancelled, CancellationCode, Diverted, CarrierDelay, WeatherDelay, NASDelay, SecurityDelay, LateAircraftDelay
Column Filter
Row Sampler
Add DelayClass column
Rule Engine
Statistics
Scorer
Table Creator
ROC Curve
Scorer
Scorer
Random Forest Predictor
Model Comparison Dashboard
Gradient Boosted Trees Learner
Late: 1,247,488; OnTime: 680,883
Value Counter
SMOTE
Random Forest Learner
Logistic Regression Learner
Naive Bayes Predictor
Gradient Boosted Trees Predictor
Logistic Regression Predictor
Remove rows with missing ArrDelay, Cancelled = 1, or Diverted = 1
Row Filter
Missing Value (Apply)
Naive Bayes Learner
Data Explorer
Origin and Dest have high cardinality; handle with care
Category to Number
Scorer
Value Counter
Scorer
Missing Value
Category to Number (Apply)
Convert Month, DayOfWeek, and DayofMonth to string
Number to String
ROC Curve
Normalizer
ROC Curve
Decision Tree Learner
Normalizer (Apply)
ROC Curve
Table Partitioner
ROC Curve
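
The two success metrics reported through the Scorer and ROC Curve nodes, F1-score and ROC AUC, can be sketched from their textbook definitions. This is a minimal illustration and does not reproduce the KNIME nodes' internals: the labels and scores are invented, "Late" is assumed to be the positive class, and the 0.5 cut-off is an assumption.

```python
def f1_score(y_true, y_pred, positive="Late"):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive="Late"):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of rows, so this is
    for illustration on small samples only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented example: true labels and predicted probabilities of "Late".
y_true = ["Late", "Late", "OnTime", "OnTime", "Late"]
scores = [0.9, 0.4, 0.5, 0.2, 0.7]
y_pred = ["Late" if s >= 0.5 else "OnTime" for s in scores]  # assumed cut-off

print(round(f1_score(y_true, y_pred), 3))
print(round(roc_auc(y_true, scores), 3))
```

F1 depends on the chosen cut-off while AUC does not, which is one reason to report both when comparing the five models.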