
06_Outlier_Detection

Outlier Detection
Exercise: Outlier Detection

Some houses might be special cases in terms of size, price, and the year when they were sold or built. Let's remove these houses from the data in order to build a better model!

1) Remove houses whose sales price lies more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile of all sales prices (Numeric Outlier node)
- Select the "SalePrice" column
- Set the interquartile range multiplier to 1.5

2) Optional: Remove the roughly 5 % of the houses that are the most extreme in terms of size (Normalizer and Rule-based Row Filter nodes)
- Normalize the "Lot Area" column using z-score
- Filter out houses whose normalized lot size is outside the range [-1.96, 1.96]

Workflow annotations: Read AmesHousing.csv (CSV Reader); Preprocessing; Numeric Outliers; Optional: Outliers in Distribution Tails
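For readers who want to check the logic of the two filtering steps outside KNIME, here is a minimal sketch in Python with pandas. It is not the KNIME nodes' implementation: the function names `remove_iqr_outliers` and `remove_zscore_tails` and the toy data are made up for illustration, the quartiles use pandas' default linear interpolation (the Numeric Outlier node may estimate them differently), and the z-score uses the sample standard deviation.

```python
import pandas as pd


def remove_iqr_outliers(df, column, k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return df[df[column].between(lower, upper)]


def remove_zscore_tails(df, column, limit=1.96):
    """Keep rows whose z-score lies within [-limit, limit].

    Assumption: z-scores use the sample standard deviation (pandas default).
    """
    z = (df[column] - df[column].mean()) / df[column].std()
    return df[z.abs() <= limit]


# Hypothetical toy data using the exercise's column names, not the Ames data.
houses = pd.DataFrame({
    "SalePrice": [100_000, 120_000, 130_000, 140_000, 150_000, 900_000],
    "Lot Area": [8_000, 8_500, 9_000, 9_500, 10_000, 200_000],
})

by_price = remove_iqr_outliers(houses, "SalePrice", k=1.5)
by_size = remove_zscore_tails(houses, "Lot Area", limit=1.96)
print(len(houses), len(by_price), len(by_size))
```

Note that the [-1.96, 1.96] range removes about 5 % of the rows only if the lot sizes are approximately normally distributed; on skewed data the share removed can differ.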
