KNIME_project_coffee_final

Import Dataset
CSV Reader
General check of the data types
Statistics View
Distribution of Quantity: fairly even, but not a normal distribution
Histogram
Export as coffee_clean.csv for future use.
CSV Writer
Check the correlation between Item and payment method: most of them are highly related
Heatmap
Check for outliers in Price Per Unit: none found
Box Plot
The relationship between Transaction date and Item
Scatter Plot
Distribution of Price Per Unit: fairly even, but not a normal distribution
Histogram
EDA can't be applied while every column is a string, so I first have to convert some columns from string to number
String to Number
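The conversion step can be sketched outside KNIME roughly as follows. This is a minimal Python sketch, not the node's actual implementation: the sample rows are invented, the column names are taken from the annotations, and turning unparseable text into a missing value (None) is an assumption.

```python
def to_number(text):
    """Return text as a float, or None when it is blank or not numeric."""
    try:
        return float(text.strip())
    except (ValueError, AttributeError):
        return None


# Hypothetical rows in which every field was read as a string.
rows = [
    {"Item": "Coffee", "Quantity": "2", "Price Per Unit": "2.0"},
    {"Item": "Cake", "Quantity": "", "Price Per Unit": "3.0"},
]

# Columns assumed to hold numbers; all other columns stay as strings.
numeric_columns = ["Quantity", "Price Per Unit"]

converted = [
    {
        key: to_number(value) if key in numeric_columns else value
        for key, value in row.items()
    }
    for row in rows
]
```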
Format: convert the string to a date type
String to Date&Time
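A rough equivalent of the date conversion, as a Python sketch. The date format (year-month-day) is an assumption, since the workflow annotations do not state it, and the sample values are invented.

```python
from datetime import datetime

# Assumed input format; adjust to whatever the dataset actually uses.
DATE_FORMAT = "%Y-%m-%d"


def to_date(text):
    """Parse text into a date, or return None when it does not match."""
    try:
        return datetime.strptime(text.strip(), DATE_FORMAT).date()
    except (ValueError, AttributeError):
        return None


# Hypothetical Transaction date values, including two that cannot be parsed.
parsed = [to_date(value) for value in ["2023-09-08", "ERROR", ""]]
```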
Missing values: delete the rows with missing values in Item and Transaction Date, because there are very few of them; replace missing values with 0 in Quantity and Price Per Unit
Missing Value
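The two missing-value rules can be sketched in Python like this. It is an illustration under assumptions, not the node's configuration: the exact column names and the sample rows are hypothetical, and a value counts as missing when it is None or blank.

```python
def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def handle_missing(rows):
    """Drop rows without Item or Transaction Date; fill numeric gaps with 0."""
    cleaned = []
    for row in rows:
        if is_missing(row.get("Item")) or is_missing(row.get("Transaction Date")):
            continue  # rule 1: remove the whole row
        fixed = dict(row)
        for column in ("Quantity", "Price Per Unit"):
            if is_missing(fixed.get(column)):
                fixed[column] = 0  # rule 2: replace the missing value with 0
        cleaned.append(fixed)
    return cleaned


# Hypothetical sample rows.
sample = [
    {"Item": "Coffee", "Transaction Date": "2023-09-08", "Quantity": 2, "Price Per Unit": 2.0},
    {"Item": None, "Transaction Date": "2023-05-16", "Quantity": 4, "Price Per Unit": 3.0},
    {"Item": "Cake", "Transaction Date": "2023-07-19", "Quantity": None, "Price Per Unit": 3.0},
]
result = handle_missing(sample)
```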
Duplicates: no duplicate rows found
Duplicate Row Filter
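A duplicate check of this kind can be sketched in Python as below. It assumes that two rows are duplicates only when every column matches, and that the first occurrence is kept; the sample rows are invented.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each row, comparing all columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample: the second row repeats the first.
sample = [
    {"Item": "Coffee", "Quantity": "2"},
    {"Item": "Coffee", "Quantity": "2"},
    {"Item": "Cake", "Quantity": "1"},
]
unique_rows = drop_duplicates(sample)
duplicate_count = len(sample) - len(unique_rows)
```

A `duplicate_count` of 0 on the real data would correspond to the "no duplicates" finding.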
Format: keep only the date and put it in a new column
String Manipulation
Format: keep only the year and put it in a new column
String Manipulation
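The two string-manipulation steps amount to taking substrings of the date text. A minimal Python sketch, assuming the value starts with a year-month-day prefix such as "2023-09-08"; the new column names "Date" and "Year" and the sample row are hypothetical.

```python
def add_date_and_year(row, source="Transaction Date"):
    """Copy the row and add 'Date' and 'Year' columns cut from the source text."""
    text = row[source]
    extended = dict(row)
    extended["Date"] = text[:10]  # first 10 characters: YYYY-MM-DD
    extended["Year"] = text[:4]   # first 4 characters: YYYY
    return extended


# Hypothetical row whose date value also carries a time part.
row = {"Item": "Coffee", "Transaction Date": "2023-09-08T00:00:00"}
extended = add_date_and_year(row)
```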
Check for outliers in Quantity: none found
Box Plot
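The outlier check read off a box plot can be approximated numerically with the common 1.5 × IQR rule. The sketch below is an assumption about the rule, not a statement of how the Box Plot node computes its whiskers; the quartile method and the sample values are also assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [value for value in values if value < low or value > high]


# Hypothetical Quantity values: evenly spread, so nothing is flagged.
quantities = [1, 2, 3, 4, 5]
outliers = iqr_outliers(quantities)
```

An empty `outliers` list would match the "none found" reading of the box plot.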
