1_decision trees_solution

train

Model requirements

Data Collection

Data Cleaning

Data Labeling

Feature Engineering

Model Training

Model Evaluation

Model Deployment

Model Monitoring

Model requirements

Titanic - Machine Learning from Disaster | Kaggle

Problem Identification

Objectives & Resources

Data Collection

Load Data

Raw Data Extraction (CSV Reader)

Understand Data (Statistics)
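
The loading and statistics steps above can be sketched in plain Python. This is a minimal illustration, not the workflow itself: the inline rows are invented toy data using the Kaggle Titanic column names, and `read_rows` and `describe` are hypothetical helper names.

```python
import csv
import io
import statistics

# Toy rows invented for illustration; the real workflow reads the Kaggle Titanic CSV.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Fare
1,0,3,male,22,7.25
2,1,1,female,38,71.28
3,1,3,female,,7.92
4,0,3,male,35,8.05
"""

def read_rows(text):
    """Parse CSV text into a list of dicts, one per passenger."""
    return list(csv.DictReader(io.StringIO(text)))

def describe(rows, column):
    """Return count, missing count, mean and median for a numeric column."""
    values = [float(r[column]) for r in rows if r[column] != ""]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

rows = read_rows(SAMPLE_CSV)
age_stats = describe(rows, "Age")
print(age_stats)
```

A summary like this makes gaps visible early (here, one missing `Age`), which is what motivates the cleaning step that follows.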

Data Cleaning

Filter Columns that are important (Column Filter)

Add missing values (Missing Value)
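
As a rough sketch of the two cleaning steps above: the source does not say which columns are kept or how missing values are filled, so the column choice and the median / most-frequent-value strategy below are assumptions. The rows are toy data and the helper names are hypothetical.

```python
import statistics
from collections import Counter

def filter_columns(rows, keep):
    """Keep only the listed columns (the Column Filter step)."""
    return [{k: r[k] for k in keep} for r in rows]

def fill_missing(rows, numeric_cols, categorical_cols):
    """Fill numeric gaps with the median and categorical gaps with the mode.

    The fill strategy is an assumption; the workflow's actual setting is not given.
    """
    filled = [dict(r) for r in rows]
    for col in numeric_cols:
        median = statistics.median(r[col] for r in rows if r[col] is not None)
        for r in filled:
            if r[col] is None:
                r[col] = median
    for col in categorical_cols:
        present = [r[col] for r in rows if r[col] is not None]
        mode = Counter(present).most_common(1)[0][0]
        for r in filled:
            if r[col] is None:
                r[col] = mode
    return filled

# Toy rows invented for illustration; None marks a missing value.
raw = [
    {"Survived": 0, "Sex": "male", "Age": 22.0, "Embarked": "S", "Ticket": "T1"},
    {"Survived": 1, "Sex": "female", "Age": 38.0, "Embarked": "C", "Ticket": "T2"},
    {"Survived": 1, "Sex": "female", "Age": None, "Embarked": "S", "Ticket": "T3"},
    {"Survived": 0, "Sex": "male", "Age": 30.0, "Embarked": None, "Ticket": "T4"},
]

kept = filter_columns(raw, ["Survived", "Sex", "Age", "Embarked"])
cleaned = fill_missing(kept, numeric_cols=["Age"], categorical_cols=["Embarked"])
```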

Data Labeling

Prepare target Variable (Rule Engine)

$Survived$ = 1 => "Yes"

$Survived$ = 0 => "No"

Calculate Domains (Domain Calculator)
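
The two labeling rules and the domain calculation above can be expressed in a few lines of Python. This is only a sketch: `label_survived` and `calculate_domain` are hypothetical names, and the rows are toy data.

```python
def label_survived(value):
    """Apply the two rules: 1 => "Yes", 0 => "No"."""
    if value == 1:
        return "Yes"
    if value == 0:
        return "No"
    return None  # no rule matched, so the label stays missing

def calculate_domain(rows, column):
    """Collect the distinct non-missing values of a column, sorted."""
    return sorted({r[column] for r in rows if r[column] is not None})

# Toy rows invented for illustration.
rows = [{"Survived": 1}, {"Survived": 0}, {"Survived": 1}]
for r in rows:
    r["Survived"] = label_survived(r["Survived"])

domain = calculate_domain(rows, "Survived")
print(domain)
```

Turning the numeric flag into a string label makes the target nominal, which is what a classification learner expects; recomputing the domain afterwards records the set of class values the column can take.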

Feature Engineering

One-Hot Encoding (One to Many)

Partition data into train/test (80/20)

Add Baseline (Constant Value Column Appender)

pred_baseline = "No"

Score Baseline (Scorer)
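
The feature engineering steps above (one-hot encoding, an 80/20 partition, and a constant "No" baseline that is then scored) might look like this in plain Python. Several details are assumptions: the split here is a seeded random shuffle, the `Sex_<value>` column naming is chosen for readability, and the rows are synthetic.

```python
import random

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per distinct value."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded

def partition(rows, train_fraction=0.8, seed=0):
    """Shuffle with a fixed seed and cut into train/test parts (assumed random split)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(rows, target, prediction):
    """Share of rows where the prediction column equals the target column."""
    hits = sum(1 for r in rows if r[target] == r[prediction])
    return hits / len(rows)

# Synthetic rows invented for illustration.
rows = [
    {"Sex": "male" if i % 2 else "female", "Survived": "Yes" if i % 3 == 0 else "No"}
    for i in range(10)
]

encoded = one_hot(rows, "Sex")
train, test = partition(encoded)

# Baseline: predict "No" for every passenger, then score it.
for r in test:
    r["pred_baseline"] = "No"
baseline_acc = accuracy(test, "Survived", "pred_baseline")
```

The baseline accuracy is the number a trained model has to beat; a decision tree that scores no better than always answering "No" has learned nothing useful.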

Model Training

Decision Tree Learner
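
To illustrate what a decision tree learner does, here is a small pure-Python sketch that greedily splits on the threshold with the lowest weighted Gini impurity. It is not KNIME's implementation; the quality measure, the depth limit, the feature layout (`[is_female, pclass]`) and the toy data are all assumptions made for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (j, t), score
    return best

def learn_tree(X, y, max_depth=3):
    """Build a tree as nested tuples (feature, threshold, left, right); leaves are labels."""
    split = best_split(X, y) if max_depth > 0 else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    j, t = split
    left_idx = [i for i, row in enumerate(X) if row[j] <= t]
    right_idx = [i for i, row in enumerate(X) if row[j] > t]
    return (
        j,
        t,
        learn_tree([X[i] for i in left_idx], [y[i] for i in left_idx], max_depth - 1),
        learn_tree([X[i] for i in right_idx], [y[i] for i in right_idx], max_depth - 1),
    )

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Toy training data invented for illustration: [is_female, pclass] -> label.
X = [[1, 1], [1, 2], [1, 3], [0, 1], [0, 2], [0, 3]]
y = ["Yes", "Yes", "No", "No", "No", "No"]

tree = learn_tree(X, y)
print(tree)
print(predict(tree, [1, 1]))
```

On this toy data the first split is on `is_female`, and the female branch is split again on passenger class.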

Model Evaluation

Decision Tree Learner
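
The workflow's node list includes a Scorer, which compares predicted and actual classes. A minimal sketch of that kind of evaluation, assuming "Yes" is treated as the positive class and using made-up labels, could be:

```python
def confusion_matrix(actual, predicted, positive="Yes"):
    """Count true/false positives and negatives for a binary target."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def summarize(cm):
    """Derive accuracy, precision and recall from the confusion counts."""
    total = sum(cm.values())
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        "precision": cm["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": cm["tp"] / actual_pos if actual_pos else 0.0,
    }

# Labels invented for illustration.
actual = ["Yes", "No", "Yes", "No", "No"]
predicted = ["Yes", "No", "No", "Yes", "No"]

cm = confusion_matrix(actual, predicted)
metrics = summarize(cm)
print(cm, metrics)
```

Reporting precision and recall alongside accuracy matters here because the classes are imbalanced: an always-"No" baseline can reach a respectable accuracy while having zero recall for survivors.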

Model Deployment

Decision Tree Learner

Train Split
CSV Reader
Test Split
CSV Reader
Column Filter
Column Filter
Missing Value
Rule Engine
Domain Calculator
Table Creator
Constant Value Column Appender
Decision Tree Predictor
One to Many
Scorer
Table Partitioner
Decision Tree Learner
Scorer
ROC Curve
Decision Tree Predictor
