H2O_Isolation_Forest_Outlier_Detection_with_Shapley_Values

H2O Isolation Forest for Outlier Detection Explained by Shapley Values
Outlier Detection using an Isolation Forest Model with H2O

This tutorial shows how to train an H2O model in KNIME. We train an Isolation Forest model to detect fraud, i.e. outliers or anomalies.

1. Prepare: Load the data and import the resulting KNIME table into H2O.
2. Learn: Train the Isolation Forest model with the H2O Isolation Forest Learner. We ask H2O to build 100 trees; all other model parameters keep H2O's defaults.
3. Predict: Make predictions on the same data using the model. The output contains the predictions (normalized anomaly scores) and the mean lengths of the predicted decision tree paths. For further processing we convert the H2O Frame back to a KNIME table.
4. Classify: If we know that about 5 percent of the data rows are anomalies, we can calculate the 95th percentile (the 0.95 quantile) of the predicted scores. The Rule Engine node uses this quantile as a threshold to classify each row as an anomaly or not.
5. Explain: We use Shapley Values to explain only the rows predicted as anomalies/outliers.

Workflow nodes: Table Reader, H2O Local Context, Table to H2O, H2O Isolation Forest Learner, H2O Isolation Forest Predictor, H2O to Table, GroupBy, Table Row to Variable (deprecated), Rule Engine, Row Splitter, Shapley Values Loop Start, Shapley Values Loop End.
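The Classify step can be sketched outside KNIME in a few lines of plain Python. This is only an illustration of the idea, not what the GroupBy and Rule Engine nodes do internally: the function names, the linear-interpolation quantile, and the strict "greater than" comparison are all assumptions made for this example.

```python
def quantile(values, q):
    """Quantile with linear interpolation between neighbouring ranks.

    Assumption: the interpolation rule is chosen for this sketch and may
    differ from the one the KNIME GroupBy node applies.
    """
    if not values:
        raise ValueError("quantile() needs at least one value")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def classify_by_quantile(scores, expected_fraction=0.05):
    """Flag the rows whose anomaly score lies above the (1 - fraction) quantile.

    `expected_fraction` is the share of rows we believe to be anomalies
    (5 percent in the tutorial). Returns the threshold and one boolean per row.
    """
    threshold = quantile(scores, 1.0 - expected_fraction)
    # Assumption: strictly greater than the threshold counts as an anomaly.
    flags = [score > threshold for score in scores]
    return threshold, flags


if __name__ == "__main__":
    # Made-up scores standing in for the normalized anomaly scores.
    demo_scores = [i / 100 for i in range(100)]
    threshold, flags = classify_by_quantile(demo_scores)
    print("threshold:", threshold)
    print("rows flagged as anomalies:", sum(flags))
```

Deriving the threshold from the data instead of hard-coding it keeps the share of flagged rows close to the expected 5 percent, whatever the scale of the scores.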

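To give an intuition for the Explain step, the sketch below computes exact Shapley values for a single row by enumerating every feature subset. It is not the implementation behind the Shapley Values Loop nodes. The scoring function `toy_score`, the baseline row, and the choice to replace "absent" features with baseline values are hypothetical and only serve this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(score, row, baseline):
    """Exact Shapley values of `score` for one row.

    A feature that is "absent" from a coalition takes its value from
    `baseline` (an assumption of this sketch). The cost grows
    exponentially with the number of features, so this only suits tiny
    examples.
    """
    n = len(row)

    def value(coalition):
        # Use the row's value for features in the coalition, else the baseline.
        mixed = [row[i] if i in coalition else baseline[i] for i in range(n)]
        return score(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = set(subset)
                with_i = without_i | {i}
                total += weight * (value(with_i) - value(without_i))
        contributions.append(total)
    return contributions


def toy_score(x):
    """Hypothetical stand-in for a model's anomaly score."""
    return abs(x[0]) + 2.0 * abs(x[1]) + x[0] * x[2]


if __name__ == "__main__":
    row = [3.0, -1.0, 5.0]
    baseline = [0.0, 0.0, 0.0]
    phi = shapley_values(toy_score, row, baseline)
    print("per-feature contributions:", phi)
    print("sum of contributions:", sum(phi))
    print("score(row) - score(baseline):", toy_score(row) - toy_score(baseline))
```

The contributions add up to the difference between the row's score and the baseline score, which is the property that makes Shapley values readable as a per-feature explanation of why a row was scored as an anomaly.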