
JKISeason2_2 PNJ

Credit Card company ABC maintains information about customer purchases and payments. The information is available for individual customers as Payments Info and Purchase Info. The company wants to segment the customers into three (3) clusters, so that marketing campaigns can be designed according to each cluster. You are asked to use both sets of information together to build a clustering model that adequately segments the customers. What patterns do customers in the same cluster have in common? Information for newly registered customers is also available. You are asked to assign cluster labels to the newly registered customers using the trained clustering model, and then export the results into a CSV file. Do the assignments make sense? How do you assess their quality?

Data source: https://www.kaggle.com/datasets/arjunbhasin2013/ccdata?resource=download
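One common way to approach the question of assignment quality is the silhouette coefficient, which compares how close each customer is to its own cluster with how close it is to the nearest other cluster. The following is a minimal sketch in plain Python, not the workflow's own implementation; the function name `silhouette_scores` and the toy points are made up for illustration, and Euclidean distance is an assumption.

```python
from math import dist  # Euclidean distance, Python 3.8+

def silhouette_scores(points, labels):
    """Return s(i) = (b - a) / max(a, b) for every point.

    Assumes at least two distinct cluster labels are present.
    """
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it does not affect the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in members) / len(members)
                for other, members in clusters.items() if other != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores

# Hypothetical toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]   # labels that match the groups
bad_labels = [0, 1, 0, 1, 0, 1]    # labels that cut across the groups

good = sum(silhouette_scores(points, good_labels)) / len(points)
bad = sum(silhouette_scores(points, bad_labels)) / len(points)
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

Values near 1 suggest that points sit well inside their cluster, values near 0 suggest overlapping clusters, and negative values suggest that a point may be closer to another cluster than to its own.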

Info about the data

Following is the Data Dictionary for the Credit Card dataset:

CUST_ID : Identification of credit card holder (categorical)
BALANCE : Balance amount left in the account to make purchases
BALANCE_FREQUENCY : How frequently the balance is updated, score between 0 and 1 (1 = frequently updated, 0 = not frequently updated)
PURCHASES : Amount of purchases made from the account
ONEOFF_PURCHASES : Maximum purchase amount done in one go
INSTALLMENTS_PURCHASES : Amount of purchases done in installments
CASH_ADVANCE : Cash in advance given by the user
PURCHASES_FREQUENCY : How frequently purchases are being made, score between 0 and 1 (1 = frequently purchased, 0 = not frequently purchased)
ONEOFF_PURCHASES_FREQUENCY : How frequently purchases are happening in one go (1 = frequently purchased, 0 = not frequently purchased)
PURCHASES_INSTALLMENTS_FREQUENCY : How frequently purchases in installments are being done (1 = frequently done, 0 = not frequently done)
CASH_ADVANCE_FREQUENCY : How frequently the cash in advance is being paid
CASH_ADVANCE_TRX : Number of transactions made with "Cash in Advance"
PURCHASES_TRX : Number of purchase transactions made
CREDIT_LIMIT : Limit of the credit card for the user
PAYMENTS : Amount of payments done by the user
MINIMUM_PAYMENTS : Minimum amount of payments made by the user
PRC_FULL_PAYMENT : Percent of full payment paid by the user
TENURE : Tenure of credit card service for the user

Workflow notes

All customers with 12 months TENURE are considered for TRAINING; all the rest (< 12 months TENURE) are used for TEST. This gives an 85% - 15% split.
314 records were discarded because of missing values (observed in the Data Explorer).
Normalize - k-Means algo - Denormalize
Views - Component

Workflow annotations: Tenure >= 12 mo, Tenure < 12 mo, Count by Tenure, Training Data, Test Data, Mean values

Nodes used: CSV Reader, Data Explorer, Missing Value, Cluster Assigner, Normalizer, Row Filter, GroupBy, Normalizer (Apply), k-Means, Denormalizer, Box Plot, Silhouette Coefficient, Numeric Distances, Math Formula, Math Formula (Multi Column)
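The steps named in the workflow notes (drop records with missing values, split by tenure, normalize, run k-means, assign new customers, denormalize) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reproduction of the KNIME nodes: the records are made up, only two columns are used, min-max scaling is assumed because the source does not say which normalization was chosen, and the farthest-point initialization is picked here only to keep the example deterministic.

```python
import csv
import io

def fit_min_max(rows):
    """Learn per-column (min, max) pairs from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def normalize(rows, stats):
    """Scale each column to [0, 1] using the training-set min and max."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, stats)] for row in rows]

def denormalize(rows, stats):
    """Map normalized values back to the original units."""
    return [[v * (hi - lo) + lo for v, (lo, hi) in zip(row, stats)] for row in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(rows, centroids):
    """Label each row with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(r, centroids[j]))
            for r in rows]

def k_means(rows, k, iterations=100):
    """Lloyd's algorithm with a deterministic farthest-point start (an assumption)."""
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(rows, centroids)
        updated = []
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Made-up records: (PURCHASES, PAYMENTS, TENURE); None marks a missing value.
customers = [
    (100.0, 120.0, 12), (120.0, 100.0, 12), (90.0, 110.0, 12),
    (2000.0, 1900.0, 12), (2100.0, 2200.0, 12), (1900.0, 2000.0, 12),
    (5000.0, 300.0, 12), (5200.0, 350.0, 12), (4900.0, None, 12),
    (110.0, 105.0, 8), (2050.0, 2100.0, 10), (5100.0, 320.0, 6),
]
complete = [c for c in customers if None not in c]        # discard missing values
train = [list(c[:2]) for c in complete if c[2] >= 12]     # Tenure >= 12 mo
new = [list(c[:2]) for c in complete if c[2] < 12]        # Tenure < 12 mo

stats = fit_min_max(train)                                # fitted on training data only
centroids = k_means(normalize(train, stats), k=3)
new_labels = assign(normalize(new, stats), centroids)     # reuse the training scaling
centroid_means = denormalize(centroids, stats)            # cluster means in original units

buffer = io.StringIO()  # stand-in for a file opened with open(path, "w", newline="")
writer = csv.writer(buffer)
writer.writerow(["PURCHASES", "PAYMENTS", "CLUSTER"])
for row, label in zip(new, new_labels):
    writer.writerow(row + [label])
print(buffer.getvalue())
```

Normalizing the new customers with the statistics learned from the training data, rather than refitting on the new data, keeps both sets on the same scale so that distances to the trained centroids remain comparable.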
