01_Training Distribution Method for Fraud Detection

Fraud Detection: Distribution Method Training

In this workflow, the Distribution Method is used to check for fraud. The Distribution Method for classification is particularly useful when the majority of the data is expected to follow a certain pattern or distribution, so that records falling far outside it can be flagged. For credit card transactions, we can use it to help determine whether a transaction is potentially fraudulent. We start by reading the training data from a sample dataset. The table is preprocessed to convert the class labels "0" and "1" to "good" and "fraud", respectively. Next, the data undergoes Z-score normalization, which standardizes each column to a mean of zero and a standard deviation of one, making columns on different scales easier to compare. The Z-score normalization model is exported for later use in deployment. We then analyze the data to check distributions and apply filters to isolate a single column (V5) and to exclude outliers beyond the 95% confidence interval. In the last step, we mark the outliers and score the model on correctly and incorrectly identified transactions. The model score can be viewed using the 'Scorer' node.
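
Outside KNIME, the normalization and model-export idea can be illustrated with a minimal Python sketch. This is not the workflow's own implementation: the function names `fit_zscore` and `apply_zscore`, the sample values, the use of the population standard deviation, and the JSON export format are all assumptions made for illustration.

```python
import json
import statistics


def fit_zscore(values):
    """Learn the mean and standard deviation of a training column."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation; the KNIME node may differ.
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("column is constant; z-score is undefined")
    return {"mean": mean, "std": std}


def apply_zscore(values, model):
    """Rescale values using previously learned parameters."""
    return [(v - model["mean"]) / model["std"] for v in values]


# Made-up stand-in for a single training column such as V5.
train_column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

model = fit_zscore(train_column)
z_scores = apply_zscore(train_column, model)

# "Export" the model so the same parameters can be reused at deployment time.
exported = json.dumps(model)
restored = json.loads(exported)
```

The point of storing the parameters separately is that new transactions are later rescaled with the training mean and standard deviation, rather than with statistics recomputed from the new data.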

The steps we perform are shown below:
1. Read Training Data
2. Data Preprocessing
3. Normalize Data
4. Save Model
5. Filter and Isolate
6. Mark Outliers and Score
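
The last two steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the workflow's actual node configuration: it assumes the column is already Z-score normalized and roughly normally distributed, so the 95% interval is approximated by |z| <= 1.96. The helper names (`relabel`, `mark_outliers`, `score`) and the sample values are hypothetical.

```python
def relabel(raw_class):
    """Convert the dataset's 0/1 class to the labels used in the workflow."""
    return "fraud" if raw_class == 1 else "good"


def mark_outliers(z_scores, threshold=1.96):
    """Flag values outside the assumed 95% interval as potential fraud."""
    return ["fraud" if abs(z) > threshold else "good" for z in z_scores]


def score(actual, predicted):
    """Count correct and incorrect predictions, treating "fraud" as positive."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == "fraud":
            counts["tp" if a == "fraud" else "fp"] += 1
        else:
            counts["tn" if a == "good" else "fn"] += 1
    correct = counts["tp"] + counts["tn"]
    counts["accuracy"] = correct / len(actual)
    return counts


# Made-up normalized values for one column and their raw 0/1 classes.
z_scores = [-0.3, 0.1, 2.5, -2.2, 0.8, 1.2]
raw_classes = [0, 0, 1, 0, 0, 1]

actual = [relabel(c) for c in raw_classes]
predicted = mark_outliers(z_scores)
result = score(actual, predicted)
```

Because fraud is rare in this kind of data, the individual counts (especially missed frauds and false alarms) are usually more informative than accuracy alone.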

URL: Kaggle Dataset https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

Created by: Ali.Marvi

Created at: 2024-04-04

On NodePit since: 2025-01-25

Last update: 2025-08-12

Created with KNIME version: v5.4.0

Tags: Fraud Detection, Deployment, Banking, Distribution, Cybersecurity, KNIME for Finance, Audit & Compliance, Practicing Data Science
