
Session 5 - Learn About Linear Regression - With Anomaly Detection Demo and Exercise

Outcome Variable (Target) = Avg Sales Prev 3 months. This is the Dependent Variable.

Independent Variables (Predictors):
CRI = Customer Relationship Index (a VERY Engineered Feature)
Prev_HQ Emails Opened (in the previous 3 months)
Prev DTLM = Previous visits from Sales Representative to the HCP
These variables are calculated by averaging lagged time series data --> They are Engineered Features.

Demo of the impact of outliers
Above Here: Impact of Anomalous Data Points, Discovering and Filtering Out Anomalies, and How This Impacts a Regression Model

Exercise:
Study the Four Techniques of Anomaly Detection Workflow. Use one of the approaches in that WF to perform anomaly detection and filter out the anomalies in the Metanode here for Linear Regression. Compare your results (the impact on the regression model) using Isolation Forest to your method.

Workflow annotations and node labels:
A Copy of Opt & Eval. Different Objective: Abs Error
Sales Data (Small)
Bottom: Dev, Top: Validation
Basic Linear Regression: Using the Node's Algorithms for OLS
Evaluation of the Results
Mimic standard square-Error: Linear Regression with Optimization (loop)
Corr & Coefficient Values
For Comparison: Smaller data - without the added outliers
Simulated outliers: Outliers added to Dev sample
Bottom: Dev, Top: Validation
Coefficients Change!
Output is data without outliers
Coefficients Change back

Nodes on the canvas: Partitioning, Table Reader, Data Prep, Lin Reg, Eval, Opt, Opt, Compare Performance, Eval, Opt, Opt, Lin Reg, Excel Reader, Concatenate, Partitioning, Lin Reg, Statistics, Statistics, IF for Outlier Detection, Lin Reg, Math Formula
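The outlier demo can also be reproduced outside KNIME with a few lines of plain Python. The sketch below is only an illustration under stated assumptions: the data are synthetic (not the Sales Data used in the workflow), there is a single predictor instead of three, and a simple IQR rule on the target stands in for the Isolation Forest metanode. The names fit_ols and iqr_filter are hypothetical helpers, not KNIME nodes.

```python
import statistics


def fit_ols(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def iqr_filter(xs, ys, k=1.5):
    """Keep rows whose target lies within [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: this is a stand-in for the workflow's Isolation Forest step.
    """
    q1, _, q3 = statistics.quantiles(ys, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    kept = [(x, y) for x, y in zip(xs, ys) if low <= y <= high]
    return [x for x, _ in kept], [y for _, y in kept]


# Synthetic "clean" sample: y = 5 + 2x plus a small deterministic wobble.
clean_x = [float(i) for i in range(20)]
clean_y = [5.0 + 2.0 * i + ((i * 7) % 5 - 2) * 0.1 for i in range(20)]

# Simulated outliers: very large targets at small predictor values.
dirty_x = clean_x + [1.0, 2.0]
dirty_y = clean_y + [200.0, 220.0]

intercept_clean, slope_clean = fit_ols(clean_x, clean_y)
intercept_dirty, slope_dirty = fit_ols(dirty_x, dirty_y)

# Filter the contaminated sample, then refit.
kept_x, kept_y = iqr_filter(dirty_x, dirty_y)
intercept_filtered, slope_filtered = fit_ols(kept_x, kept_y)

print("clean    slope:", round(slope_clean, 3))
print("dirty    slope:", round(slope_dirty, 3))
print("filtered slope:", round(slope_filtered, 3))
```

In this toy sample the two added points are enough to flip the sign of the slope, and removing them restores the original fit, which mirrors the "Coefficients Change!" and "Coefficients Change back" annotations. An IQR rule on the target is crude when the target has a strong trend, so for the exercise it is worth comparing it against the other techniques in the Anomaly Detection workflow.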
