Churn

Here I am using up all the churn records and selecting the exact number of not-churn records that will give me a 60/40 (not churn/churn) split.

I have 1498 churn records, and this is 40% of some unknown total. I will simply create an equation and solve for this unknown total, then use that value to find out how many non-churn records I need.

0.4 of x = 1498
x = 1498/0.4
x = 3745

(3745 is the total number of records we can have in a sample where we use all our churn records and churn records constitute 40% of our data, based on a churn count of 1498.)

Now we know the total number of records we would have, and the number of churn records. The number of not-churn records is therefore 3745 - 1498 = 2247. Plug this value into the Row Sampling node to select this many non-churn records, giving you a 60/40 not churn/churn distribution.

Workflow nodes and their annotations, in order:

CSV Reader: Node 1
GroupBy: 73.5% Not Churn, 26.5% Churn
Row Splitter: Split into churn and not churn. Churn top (N=1498)
Row Sampling: Draw a random sample of 60% not churn cases
Concatenate: Stack churn and not churn rows
GroupBy: 60% Not Churn, 40% Churn
Partitioning: 80/20 test/train split
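The arithmetic in the annotation above can be checked with a short Python sketch. This is only an illustration, not part of the KNIME workflow: the function name `balanced_sample_sizes` and its parameters are hypothetical, and the churn share is passed as a whole-number percentage so the division stays exact.

```python
from fractions import Fraction


def balanced_sample_sizes(churn_count, churn_percent):
    """Return (total, non_churn) such that churn_count rows make up
    churn_percent percent of the total sample.

    Hypothetical helper for illustration; uses exact rational arithmetic
    to avoid floating-point rounding in the division.
    """
    # Solve (churn_percent / 100) * total = churn_count for total.
    total = round(Fraction(churn_count * 100, churn_percent))
    # Every remaining row in the sample must be a non-churn record.
    non_churn = total - churn_count
    return total, non_churn


total, non_churn = balanced_sample_sizes(1498, 40)
print(total)      # 3745
print(non_churn)  # 2247
```

The second value, 2247, is the absolute row count that the annotation says to enter in the Row Sampling node.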