
Session 3- Four_Techniques_Outlier_Detection with Extras

Four Techniques for Outlier Detection

This workflow reads a sample of the airline dataset and detects outlier airports based on the average arrival delay at each airport. Four techniques are applied: numeric outlier, z-score, DBSCAN and isolation forest. The outlier airports detected by each technique are visualized on a map of the US using the KNIME OSM integration.
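The first two techniques can be sketched outside KNIME in a few lines of Python. This is a minimal illustration, not the workflow's actual node configuration: the airport codes and delay values are made up, and the function names, the `k=1.5` multiplier and the z thresholds are assumptions chosen for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Numeric outlier rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def zscore_outliers(values, threshold=3.0):
    """Z-score rule: flag values more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs((v - mean) / std) > threshold for v in values]


# Hypothetical average arrival delay (minutes) per airport, for illustration only.
avg_delay = {
    "AAA": 4.0, "BBB": 6.5, "CCC": 5.2, "DDD": 7.1,
    "EEE": 5.8, "FFF": 6.0, "GGG": 4.9, "HHH": 48.0,
}
airports = list(avg_delay)
delays = list(avg_delay.values())

iqr_flags = iqr_outliers(delays, k=1.5)
# The toy sample is tiny, so a lower z threshold than the usual 3 is used here.
z_flags = zscore_outliers(delays, threshold=2.0)

iqr_airports = [a for a, flag in zip(airports, iqr_flags) if flag]
z_airports = [a for a, flag in zip(airports, z_flags) if flag]
print("Numeric outlier:", iqr_airports)
print("Z-score:", z_airports)
```

Both rules look at a single numeric column, which matches the setup described above, where each airport is summarized by one average arrival delay.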

URL: Four Techniques for Outlier Detection https://www.knime.com/blog/four-techniques-for-outlier-detection
URL: Airline data collected and published by DOT Bureau of Transportation Statistics https://www.transtats.bts.gov/OT_Delay/OT_DelayCause1.asp

Workflow diagram (annotation and node labels): This workflow detects outliers in the data using the following techniques: numeric outlier, z-score, DBSCAN and isolation forest.
Annotated sections: Read Data; Preprocess Data; Outlier Detection - Numeric Outlier; Outlier Detection - Z-Score; Outlier Detection - DBSCAN; Outlier Detection - Isolation Forest; Outlier Visualization.
Node labels: Read airline data, Group by arrival airport, Detect outliers, Clustering, Distance function with Silhouette, Isolation forest, Node 834, Node 839, Node 840.
Configuration labels: Number of clusters, Maximum number of iterations, k-value, Threshold of z, MinPts, Epsilon.
Nodes and components: Read data, Preproc, Numeric Outliers, Numeric Distances, DBSCAN, Density of delay, Mark outliers, Row Filter, MapViz, ClusterVisualization, Integer Configuration, Double Configuration, Merge Variables, Python Script, Model for Neg Sentiment & Test of Outliers as Predictors, Value Counter, Value Counter.
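The MinPts and Epsilon settings belong to the DBSCAN branch, where points that do not fall into any dense region are treated as outliers. The sketch below is a small pure-Python DBSCAN on one numeric feature, written only to show how the two parameters interact. It is not the KNIME DBSCAN node: the function name `dbscan_labels`, the absolute-difference distance, the sample delays and the chosen `eps` and `min_pts` values are all assumptions made for this illustration.

```python
def dbscan_labels(points, eps, min_pts):
    """Label each 1-D point with a cluster id (0, 1, ...) or -1 for noise.

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps` of it.
    """
    n = len(points)
    noise = -1
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(n) if abs(points[i] - points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


# Hypothetical average arrival delays (minutes), one value per airport.
delays = [4.0, 6.5, 5.2, 7.1, 5.8, 6.0, 4.9, 48.0]
labels = dbscan_labels(delays, eps=1.5, min_pts=3)
outlier_delays = [d for d, label in zip(delays, labels) if label == -1]
print(outlier_delays)
```

A larger Epsilon or a smaller MinPts makes it easier for points to join a cluster, so fewer points end up labeled as noise.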
