
Data Cleaning Labs Solution

Lab 1: Cleaning Missing Ratings in Movie Table

Lab 2: Filtering Incomplete User Data

Lab 3: Replacement for Missing Loyalty Points

Lab 4: Removing Duplicate Movies

Lab 5: Detecting Duplicate Bookings

Lab 6: Correcting Incorrect Rating Types

Lab 8: Fixing Numeric Columns in Fooditemsize

Lab 13: Removing Irrelevant Columns from Payment Data

Lab 14: Dropping Columns with Too Many Missing Values

Lab 7: Validating Payment Dates

Lab 9: Formatting User Names

Lab 10: Standardizing Movie Genres

Lab 11: Filtering Out Unrated Movies

Lab 12: Filtering Failed Payment Records

Read movie.csv
CSV Reader
Handle missing ratings.
Missing Value
Read user.csv
CSV Reader
Filter out rows missing both email and phone.
Rule-based Row Filter
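The rule applied here (drop a row only when both contact fields are missing) can be sketched outside KNIME in plain Python. This is an illustrative sketch, not the node's configuration: the column names email and phone are inferred from the annotation, user_id and the sample rows are hypothetical, and None stands in for a missing cell.

```python
# Hypothetical sample rows; None stands in for a missing cell.
users = [
    {"user_id": 1, "email": "a@example.com", "phone": None},
    {"user_id": 2, "email": None, "phone": None},
    {"user_id": 3, "email": None, "phone": "555-0100"},
]


def has_contact(row):
    """Return True when at least one of email or phone is present."""
    return row["email"] is not None or row["phone"] is not None


# Keep every row that still has some way to reach the user.
kept = [row for row in users if has_contact(row)]
print([row["user_id"] for row in kept])  # prints [1, 3]
```

A row with only one of the two fields missing is kept, which is the difference between this rule and a plain "drop rows with any missing value" filter.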
Read membership.csv
CSV Reader
Replace missing current_points with 100 as a welcome bonus.
Missing Value
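As a rough equivalent of this fixed-value replacement, the following Python sketch fills missing current_points with 100. The member_id column and the sample rows are hypothetical; only current_points and the value 100 come from the lab description.

```python
WELCOME_BONUS = 100  # fixed replacement value named in the lab

# Hypothetical sample rows; None stands in for a missing cell.
memberships = [
    {"member_id": "M1", "current_points": 250},
    {"member_id": "M2", "current_points": None},
]

# Copy each row, substituting the bonus only where the value is missing.
filled = [
    {
        **row,
        "current_points": WELCOME_BONUS
        if row["current_points"] is None
        else row["current_points"],
    }
    for row in memberships
]
print([row["current_points"] for row in filled])  # prints [250, 100]
```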
This filters out sensitive or unnecessary details.
Column Filter
Normalize name formatting
String Manipulation
Read payment.csv
CSV Reader
Convert transaction_datetime to KNIME Date/Time
String to Date&Time
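Validating dates usually means parsing each string against an expected pattern and flagging whatever fails. The sketch below shows that idea in Python. The pattern "%Y-%m-%d %H:%M:%S" is an assumption, since the actual format of transaction_datetime in payment.csv is not stated here, and the sample strings are hypothetical.

```python
from datetime import datetime

# Assumed pattern; the real format of transaction_datetime is not given.
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical sample values, including two that should be rejected.
raw_values = ["2024-03-01 18:45:00", "not a date", "2024-02-30 10:00:00"]


def parse_or_none(text):
    """Parse text with the assumed pattern; return None if it is invalid."""
    try:
        return datetime.strptime(text, ASSUMED_FORMAT)
    except ValueError:
        return None


parsed = [parse_or_none(value) for value in raw_values]
invalid = [value for value, result in zip(raw_values, parsed) if result is None]
print(invalid)  # prints ['not a date', '2024-02-30 10:00:00']
```

Strings that match the pattern but name an impossible date, such as 30 February, are rejected along with unparseable text.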
Keeps only columns that are sufficiently complete.
Missing Value Column Filter
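The idea of keeping only sufficiently complete columns can be sketched as a threshold on the fraction of missing cells per column. In this Python sketch the 50% threshold, the column names, and the sample rows are all hypothetical, and the treatment of a column sitting exactly at the threshold may differ from the node's behaviour.

```python
# Hypothetical sample rows; None stands in for a missing cell.
rows = [
    {"payment_id": 1, "amount": 12.0, "coupon": None},
    {"payment_id": 2, "amount": 8.5, "coupon": None},
    {"payment_id": 3, "amount": None, "coupon": "SAVE5"},
]

MAX_MISSING_FRACTION = 0.5  # assumed threshold, not taken from the lab


def complete_columns(table, max_missing):
    """Return names of columns whose missing fraction is within the limit."""
    keep = []
    for column in table[0].keys():
        missing = sum(1 for row in table if row[column] is None)
        if missing / len(table) <= max_missing:
            keep.append(column)
    return keep


kept_columns = complete_columns(rows, MAX_MISSING_FRACTION)
trimmed = [{column: row[column] for column in kept_columns} for row in rows]
print(kept_columns)  # prints ['payment_id', 'amount']
```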
Read payment.csv
CSV Reader
String Manipulation
Convert Rating to Decimal
String to Number
Read user.csv
CSV Reader
Read movie.csv
CSV Reader
Read payment.csv
CSV Reader
Select Failed Transactions
Row Filter
This will keep only valid, non-null rating rows.
Row Filter
Read movie.csv
CSV Reader
Read movie.csv
CSV Reader
Read movie.csv
CSV Reader
Remove duplicate movie entries based on title.
Duplicate Row Filter
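Removing duplicates by a key column amounts to remembering which keys have already been seen. The Python sketch below keeps the first row for each title; keeping the first occurrence is an assumption about how the node is configured, and movie_id and the sample rows are hypothetical.

```python
# Hypothetical sample rows with one repeated title.
movies = [
    {"movie_id": 1, "title": "Inception"},
    {"movie_id": 2, "title": "Heat"},
    {"movie_id": 3, "title": "Inception"},
]


def drop_duplicates(rows, key):
    """Keep the first row seen for each distinct value of key."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


unique_movies = drop_duplicates(movies, "title")
print([row["movie_id"] for row in unique_movies])  # prints [1, 2]
```

Matching on the title alone treats two different films with the same name as duplicates, so a stricter key (for example title plus release year, if such a column exists) may be safer.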
Convert rating column to numeric
String to Number
Fill missing ratings with the mean
Missing Value
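The two steps here (convert the rating strings to numbers, then fill the gaps with the column mean) can be sketched in Python as follows. The sample values are hypothetical, and treating unparseable strings as missing is an assumption about how the conversion step is configured.

```python
from statistics import mean

# Hypothetical raw rating strings, including an empty cell and a bad value.
raw_ratings = ["8.5", "", "7.0", "N/A", "9.0"]


def to_number(text):
    """Convert text to float; return None when it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None


numeric = [to_number(value) for value in raw_ratings]

# The mean is computed from the valid ratings only.
rating_mean = mean(value for value in numeric if value is not None)

filled = [rating_mean if value is None else value for value in numeric]
```

The order matters: filling with the mean only works once the column is numeric, which is why the conversion comes first.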
Read user.csv
CSV Reader
Remove any duplicate bookings
Duplicate Row Filter
String to Number
Convert rate column to numeric
String to Number
Replace missing or invalid numeric ratings
Missing Value
Read payment.csv
CSV Reader
Read fooditemsize.csv
CSV Reader
