
KNIME_project_DWBI (2)

Data cleaning
Extraction

Database connection & persistence

Dimension Tables / Data Transformation

Fact Table / Data Transformation

Joining Tables

TripDuration <=0
Row Filter
Rename Pickup & Dropoff Cols
Column Renamer
Green Trip Data
Parquet Reader
Yellow Trip Data
Parquet Reader
Yellow and Green Dataset with all fields needed for DB
Concatenate
All missing fares to 1
Missing Value
Filter all but payment_type
Column Filter
Filtering non-relevant columns
Column Filter
Revenue calculation
Math Formula
CSV Reader
Joiner
Joiner
Joiner
Joiner
Joiner
Joiner
Renaming PK to FK
Column Renamer
Remove duplicates of payment_type
Duplicate Row Filter
payment_type_id to PaymentTypeID
Column Renamer
Filter all but RatecodeID
Column Filter
PaymentType Table ready
Rule Engine
Renaming PK to FK
Column Renamer
RatecodeID to RateCodeID
Column Renamer
Renaming PK to FK
Column Renamer
Filter all but VendorID
Column Filter
Renaming PK to FK
Column Renamer
Renaming PK to FK
Column Renamer
Remove duplicates of RatecodeID
Duplicate Row Filter
Joiner
RateCode Table Ready
Rule Engine
Renaming PK to FK
Column Renamer
Renaming PK to FK
Column Renamer
Joiner
passenger_count to PassengerCount, trip_distance to TripDistance
Column Renamer
Renaming PK to FK
Column Renamer
Clean missing or 0 vals
Row Filter
YELLOW: Clean Dropoff < pickup
Rule-based Row Filter
Clean missing or 0 vals
Row Filter
GREEN: Clean Dropoff < pickup
Rule-based Row Filter
Rule Engine
Remove duplicates of VendorID
Duplicate Row Filter
Remove duplicates of TaxiTypeID
Duplicate Row Filter
Setting Trip_PK
Math Formula
TaxiType Table Ready
Rule Engine
Filter all but TaxiTypeID
Column Filter
Insert PaymentType
DB Writer
Microsoft SQL Server Connector
Rename service_zone to ServiceZone
Column Renamer
Insert Vendor
DB Writer
Insert RateCode
DB Writer
Rename Pickup & Dropoff Cols
Column Renamer
Insert DateTimeTable
DB Writer
Rename Pickup & Dropoff Cols
Column Renamer
Insert TaxiType
DB Writer
Generate Surrogate Key - PaymentType_PK
Math Formula
Insert Location
DB Writer
Generate Surrogate Key - Location_PK
Math Formula
Insert Trip
DB Writer
Generate Surrogate Key - Vendor_PK
Math Formula
Appending TaxiTypeID 1 for yellow taxis
Constant Value Column Appender
Generate Surrogate Key - RateCode_PK
Math Formula
Filtering all missing Foreign Keys
Row Filter
Generate Surrogate Key - DateTime_PK
Math Formula
Generate Surrogate Key - TaxiType_PK
Math Formula
Filtering LocationID {0,264,265}
Row Filter
Filter all but PickupFullDatetime
Column Filter
Rename to FullDateTime
Column Renamer
Appending TaxiTypeID 2 for green taxis
Constant Value Column Appender
Add TimeBand Column
Rule Engine
Rename to FullDateTime
Column Renamer
Filter all but DropoffFullDatetime
Column Filter
Extract separate datetime fields
Date&Time Part Extractor
Add IsWeekend Column
Rule Engine
Remove date duplicates
Duplicate Row Filter
Clean missing or 0 vals
Row Filter
Yellow: Filter Columns
Column Filter
Filtering PULocationID and DOLocationID {0,264,265}
Row Filter
Green: Filter PassCount
Column Filter
Concatenate
Concatenate
passenger_count to 1
Missing Value
Filtering trip_distance <=0
Row Filter
Filter passenger_count <1 and >8
Row Filter
Rename Pickup & Dropoff Cols
Column Renamer
TripDuration calculation
Date&Time Difference
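
For readers without KNIME at hand, the cleaning rules named in the node labels above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual configuration: the function name clean_trips, the dict-per-row layout, the order in which the rules run, and the use of minutes for TripDuration are all hypothetical, and the passenger-count filter is read as removing rows outside the range 1 to 8. Column names are taken from the labels.

```python
from datetime import datetime

# Location IDs named in the "Filtering LocationID {0,264,265}" labels.
EXCLUDED_LOCATION_IDS = {0, 264, 265}


def clean_trips(rows):
    """Apply the labelled cleaning rules to a list of dict rows (hypothetical helper)."""
    cleaned = []
    for row in rows:
        trip = dict(row)  # copy so the input rows are left untouched

        # "passenger_count to 1" (Missing Value): fill a missing count with 1.
        if trip.get("passenger_count") is None:
            trip["passenger_count"] = 1

        # "Filter passenger_count <1 and >8": assumed to drop rows outside 1..8.
        if not 1 <= trip["passenger_count"] <= 8:
            continue

        # "Filtering trip_distance <=0": drop missing or non-positive distances.
        if trip.get("trip_distance") is None or trip["trip_distance"] <= 0:
            continue

        # "Filtering PULocationID and DOLocationID {0,264,265}".
        if (trip["PULocationID"] in EXCLUDED_LOCATION_IDS
                or trip["DOLocationID"] in EXCLUDED_LOCATION_IDS):
            continue

        # "TripDuration calculation" followed by "TripDuration <=0" (Row Filter).
        # Minutes are an assumed unit; the workflow's granularity is not shown.
        delta = trip["DropoffFullDatetime"] - trip["PickupFullDatetime"]
        duration_minutes = delta.total_seconds() / 60
        if duration_minutes <= 0:
            continue
        trip["TripDuration"] = duration_minutes

        cleaned.append(trip)
    return cleaned
```

Calling clean_trips on a list of rows returns only the rows that pass every rule, each with an added TripDuration field.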
