
Contracts_Fraud_Detection_Usecase_example

Outlier Detection / Fraud Detection in Contracts
Discover anomalies in contract payment amounts via:
- data visualization
- basic stats
- clustering
- isolation forest

Loading the contracts: 100 contracts stored as PDF files are transformed into Document objects (PDF Parser, getting the text from the PDFs).

Extracting Contract ID, Date and Payment Amount from the texts using RegEx. The following information is extracted from each contract:
- Date: the date on which the contract was signed
- Contract ID: each document contains a 7-character ID (e.g. C000058)
- Payment Amount: each document contains a payment amount

Finding outliers:
- via visuals and stats (Visual Analytics): bar chart, scatter plot, box plot and histogram, with a Color Manager coloring on payments
- via basic stats: Numeric Outliers and Tile View (list of outliers)
- via z-score: Normalizer (z-score), then the check |z| > threshold, then Visualize Outliers
- via clustering on one or more features: Normalizer (to [0, 1]), then k-Means or DBSCAN, each followed by Visualize Clusters
- via Isolation Forest: H2O Isolation Forest with 10 trees and 10 levels, then Visualize Outliers
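The RegEx extraction step described above can be sketched outside KNIME in a few lines of Python. This is only an illustration: the sample text, the ISO date format and the dollar-amount format are assumptions, since the source does not show the layout of the contracts. Only the ID shape (a 7-character ID such as C000058) comes from the description.

```python
import re

# Hypothetical contract text; the real layout of the PDFs is not shown in the source.
SAMPLE_TEXT = """Contract ID: C000058
Signed on 2019-03-12 between the parties.
The payment amount is $12,500.00 due within 30 days."""

# Assumed patterns: 'C' followed by six digits, an ISO date, and a dollar amount.
CONTRACT_ID_RE = re.compile(r"\bC\d{6}\b")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"\$\s?(\d[\d,]*(?:\.\d+)?)")


def extract_fields(text):
    """Return contract ID, date and payment amount; None for any field not found."""
    contract_id = CONTRACT_ID_RE.search(text)
    date = DATE_RE.search(text)
    amount = AMOUNT_RE.search(text)
    return {
        "contract_id": contract_id.group(0) if contract_id else None,
        "date": date.group(0) if date else None,
        # Strip thousands separators before converting to a number.
        "payment_amount": float(amount.group(1).replace(",", "")) if amount else None,
    }


fields = extract_fields(SAMPLE_TEXT)
print(fields)
```

Returning None for a missing field, rather than raising, keeps a single malformed contract from stopping the extraction over all 100 documents.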
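The two basic-stats checks named above, the z-score test (|z| > threshold) and numeric outliers, can be sketched with the Python standard library. The payment values are made up, the threshold of 3 is an assumed default, and the interquartile-range rule with multiplier 1.5 is an assumption about how numeric outliers are flagged rather than a description of the node's exact settings.

```python
import statistics

# Made-up payment amounts for illustration; the last value is deliberately extreme.
payments = [1200.0, 1350.0, 1280.0, 1190.0, 1420.0, 1310.0,
            1260.0, 1380.0, 1230.0, 1340.0, 1290.0, 9800.0]


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


print(zscore_outliers(payments))
print(iqr_outliers(payments))
```

With few data points the z-score is bounded (for n values and the population standard deviation it cannot exceed the square root of n - 1), so a threshold of 3 can never fire on fewer than about eleven contracts. The IQR rule does not have that limitation.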
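For the clustering route, the idea of normalizing payments to [0, 1] and then treating DBSCAN noise points as outliers can be sketched as follows. This is a simplified one-feature DBSCAN written from scratch for illustration, not the KNIME node; the values of `eps` and `min_pts` and the payment data are assumptions.

```python
def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def dbscan_1d(values, eps=0.05, min_pts=3):
    """Label each value with a cluster index, or -1 for noise (outlier)."""
    n = len(values)
    # Neighborhood of each point, including the point itself.
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n  # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels


# Made-up payment amounts; the last value is deliberately extreme.
payments = [1200.0, 1350.0, 1280.0, 1190.0, 1420.0, 1310.0,
            1260.0, 1380.0, 1230.0, 1340.0, 1290.0, 9800.0]
labels = dbscan_1d(min_max(payments))
outliers = [p for p, label in zip(payments, labels) if label == -1]
print(outliers)
```

Normalizing first matters because `eps` is a distance: without rescaling, its meaning would depend on the currency unit of the payments.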
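The isolation forest idea can also be illustrated in plain Python. The sketch below is a simplified single-feature version, not the H2O implementation: it skips subsampling, builds each tree on all values, and uses made-up payments. Only the settings of 10 trees and 10 levels are taken from the description above.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n > 2:
        return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n
    return 1.0 if n == 2 else 0.0


def build_tree(values, depth, max_depth, rng):
    """Split at random points until a value is isolated or max_depth is reached."""
    if len(values) <= 1 or depth >= max_depth or min(values) == max(values):
        return {"size": len(values)}
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {"split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(tree, value, depth=0):
    """Number of splits needed to reach the leaf holding value, plus a leaf-size correction."""
    if "split" not in tree:
        return depth + c_factor(tree["size"])
    branch = tree["left"] if value < tree["split"] else tree["right"]
    return path_length(branch, value, depth + 1)


def anomaly_scores(values, n_trees=10, max_depth=10, seed=0):
    """Score in (0, 1]; values that are isolated after few splits score higher."""
    if len(values) < 2:
        return [0.0 for _ in values]
    rng = random.Random(seed)
    trees = [build_tree(values, 0, max_depth, rng) for _ in range(n_trees)]
    norm = c_factor(len(values))
    scores = []
    for v in values:
        mean_path = sum(path_length(t, v) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores


# Made-up payment amounts; the last value is deliberately extreme.
payments = [1200.0, 1350.0, 1280.0, 1190.0, 1420.0, 1310.0,
            1260.0, 1380.0, 1230.0, 1340.0, 1290.0, 9800.0]
scores = anomaly_scores(payments)
most_anomalous = payments[scores.index(max(scores))]
print(most_anomalous)
```

An extreme payment is separated from the rest by almost any random split, so its average path is short and its score is high. Unlike the z-score check, this needs no assumption about the shape of the payment distribution.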
