
data_preprocessing

Binning

Group rare countries to "Other"

Handle Outliers

Redundant Column removal using Correlation

Stack Overflow Developer Survey 2023
CSV Reader
Turn "NA" string to real Missing Value
String Manipulation (Multi Column)
Manual Removal of Irrelevant Columns
Column Filter
String to Number
Remove redundant Columns
Correlation Filter
Purchase Influence Label Encoding
Rule Engine
Age Banding
Rule Engine
Education Banding and Label Encoding
Rule Engine
Remove columns with more than 50% missing values
Missing Value Column Filter
Replace Outliers with Percentile Value
Rule Engine
One Hot Encoding for nominal Columns
One to Many
Column Filter
Lookup Countries labeled as "Other"
Value Lookup
Count Entries by Country
GroupBy
Label countries with fewer than 400 entries as "Other"
Rule Engine
Convert qualitative to quantitative in "YearsCode"
String Manipulation
Only Retain Employed Respondents and Non Null Target Label
Row Filter
Container Output (Table)
Convert qualitative to quantitative in "YearsCodePro"
String Manipulation
Org Size Banding
Rule Engine
Handle null values (replace with median or most frequent)
Missing Value
Container Input (Table)
Numeric Outliers
Count Encoding for Multi Choice Columns
String Manipulation (Multi Column)
train.csv
CSV Writer
Joiner
test.csv
CSV Writer
Min-Max Scaling
Normalizer
Calculate 25% and 75% Salary percentile for each country
GroupBy
Calculate correlation Matrix
Rank Correlation
Split 80-20
Table Partitioner
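
For readers who want to follow the logic outside KNIME, three of the steps listed above (labeling countries with fewer than 400 entries as "Other", Min-Max Scaling, and the 80-20 split) can be sketched in plain Python. This is a minimal illustration, not the workflow's actual node configuration: the function names, the `seed` parameter, and the assumption that the split is a random shuffle are all hypothetical, since the listing does not show how the Table Partitioner is configured.

```python
from collections import Counter
import random


def group_rare(values, min_count=400, other="Other"):
    """Replace categories that occur fewer than min_count times with `other`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle rows reproducibly and split them into (train, test).

    Assumption: a random split; the original partitioning mode is not shown.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

A usage example on toy data (the threshold is lowered from 400 to 2 so the effect is visible): `group_rare(["US", "US", "DE", "DE", "FJ"], min_count=2)` keeps "US" and "DE" and relabels "FJ" as "Other".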
