
A) Case - Fraud detection

Model Training and Evaluation
Exploration and Preparation
Load data
Deployment
CONTEXT: We are considering a large banking company operating in Europe that is facing issues related to fraud against its customers.
DATASET: Consider the datasets 'fraud_dev.csv' and 'fraud_scoring.csv', which include behavioural data of the bank's customers. Some customers in the dataset are linked to fraudulent activity on their bank account (see the variable FLG_FRAUD, where 1="fraud" and 0="otherwise").
GOAL: The bank's objective is to predict possible frauds before they occur and to take appropriate countermeasures to stop the most likely ones. Can an ML model effectively predict potential frauds?
MISCLASSIFICATION COSTS: The average cost of a fraud incident is $50,000, while the estimated cost of investigating a potential fraud that later turns out to be a false positive is $400.
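The misclassification costs above imply a decision threshold far below the usual 0.5. The following is a minimal Python sketch of that arithmetic, not part of the workflow itself. It assumes that the model outputs calibrated fraud probabilities, that a missed fraud costs the full $50,000, and that investigating a true fraud costs nothing and prevents the loss. The function names and the toy data are hypothetical.

```python
# Costs taken from the case description.
FRAUD_COST = 50_000          # cost of a missed fraud (false negative)
INVESTIGATION_COST = 400     # cost of investigating a false positive


def break_even_threshold(fn_cost, fp_cost):
    """Probability above which investigating is cheaper than ignoring.

    Investigate when p * fn_cost > (1 - p) * fp_cost,
    i.e. when p > fp_cost / (fp_cost + fn_cost).
    """
    return fp_cost / (fp_cost + fn_cost)


def total_cost(y_true, scores, threshold,
               fn_cost=FRAUD_COST, fp_cost=INVESTIGATION_COST):
    """Sum misclassification costs for a given probability threshold."""
    cost = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 0:
            cost += fp_cost      # investigated, but no fraud
        elif not flagged and label == 1:
            cost += fn_cost      # fraud that was not flagged
    return cost


# Hypothetical toy data: 1 = fraud, 0 = otherwise (as in FLG_FRAUD).
toy_labels = [1, 0, 0, 1]
toy_scores = [0.9, 0.02, 0.001, 0.005]

threshold = break_even_threshold(FRAUD_COST, INVESTIGATION_COST)
cost_at_break_even = total_cost(toy_labels, toy_scores, threshold)
cost_at_half = total_cost(toy_labels, toy_scores, 0.5)
```

Under these assumptions the break-even probability is 400 / 50,400, or roughly 0.8%, so even customers with a low predicted fraud probability may be worth investigating. If the training data is rebalanced, as the Balancing step suggests, the predicted probabilities would likely need recalibration before a threshold like this is applied.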
Column Appender
Import scoring dataset
CSV Reader
Data Cleaning and Feature Engineering
Import dev dataset
CSV Reader
Risk Classes
Pie Chart
Logistic Regression
Model 1
Train 70% Test 30%
Table Partitioner
Random Forest
Model 2
Color Manager
Balancing
Model evaluation
Clients list export
Excel Writer
Data prep
Random Forest Predictor
Exploratory Data Analysis
Risk classes
Numeric Binner
