
Fraud_Detection_DBSCAN_Training

Fraud Detection using DBSCAN Clustering Algorithm

This workflow uses the DBSCAN clustering algorithm to detect fraud by identifying outliers in credit card transaction data. Density-based spatial clustering of applications with noise (DBSCAN) is an unsupervised clustering algorithm that works well when the density of the data does not vary significantly across different parts of the dataset. We normalize the training data and sample a subset for analysis; points that DBSCAN labels as outliers (noise) are tagged as potential fraud. Metrics are extracted at the end and can be viewed through the 'Scorer' node.

Steps taken for training:
1. Read Training Data
2. Data Preprocessing: Normalize the data into the range [0,1] or [good,fraud], and save the Normalizer model
3. Train DBSCAN using Euclidean Distance
4. Mark Outliers and Evaluate Model Results
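
The four steps above can be sketched outside the workflow in plain Python. This is a minimal illustration, not the workflow's actual implementation: the synthetic `good` and `fraud` rows stand in for the Kaggle transaction table, the helper names (`min_max_normalize`, `dbscan`) are hypothetical, and the values `eps=0.1` and `min_pts=5` are assumptions chosen for the toy data rather than settings taken from the workflow.

```python
import math
import random

NOISE = -1  # label used for outliers


def min_max_normalize(rows):
    """Scale every column into [0, 1]; also return the (mins, maxs) 'model'."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    scaled = [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]
    return scaled, (mins, maxs)


def dbscan(points, eps, min_pts):
    """Brute-force DBSCAN with Euclidean distance; returns one label per point."""
    n = len(points)
    # Neighbourhood of each point (a point counts as its own neighbour).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbors[j])
    return labels


# 1. "Read" training data: synthetic stand-in, 200 dense rows plus 5 isolated rows.
random.seed(0)
good = [[random.gauss(50, 2), random.gauss(100, 5)] for _ in range(200)]
fraud = [[5, 400], [120, 10], [0, 0], [130, 420], [60, 300]]
rows = good + fraud
is_fraud = [False] * len(good) + [True] * len(fraud)

# 2. Normalize into [0, 1] and keep the normalizer parameters for reuse.
scaled, normalizer_model = min_max_normalize(rows)

# 3. Cluster with Euclidean distance (eps and min_pts are assumed values).
labels = dbscan(scaled, eps=0.1, min_pts=5)

# 4. Mark outliers and score them against the known labels.
flagged = [label == NOISE for label in labels]
tp = sum(f and t for f, t in zip(flagged, is_fraud))
fp = sum(f and not t for f, t in zip(flagged, is_fraud))
fn = sum(t and not f for f, t in zip(flagged, is_fraud))
tn = sum(not f and not t for f, t in zip(flagged, is_fraud))
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```

On real transaction data the separation is far less clean than in this toy example, so `eps` and `min_pts` would need tuning, and the brute-force neighbour search would have to be replaced by an indexed one for anything beyond a small sample.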

URL: Kaggle Dataset https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
