
Building a Churn Predictor with Snowflake


URL: Snowflake Extension Guide | KNIME Documentation https://docs.knime.com/ap/latest/snowflake_extension_guide/#quickstart-with-snowflake-in-knime


This workflow connects to a Snowflake database, loads and joins customer data, computes summary statistics, and prepares the data for machine learning. The data is split into training and test sets. The training data is converted for use with H2O, and a Random Forest model is trained to predict customer churn. The trained model is saved for future use and also applied at scale to the test data directly in the database. Finally, the workflow evaluates the model's performance using accuracy statistics and a ROC curve to measure how well churn is predicted.

Connect to Snowflake database
Write Telco data to Snowflake and join tables
Compute in-database summary stats
Sample and partition the dataset
Convert to H2O context, train and apply ML model
Score ML model
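The join and partition steps above run in-database in the actual workflow (DB Joiner and DB Table Partitioner nodes); as a plain-Python sketch of the same logic, with illustrative column names that are not the workflow's real schema:

```python
# Inner-join two customer tables on a shared customer ID, then split 80/20.
# Table contents and the "id"/"plan"/"minutes" columns are made up for illustration.

def inner_join(left, right, key):
    """Join two lists of dicts on `key`, keeping only rows present in both."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def partition(rows, train_fraction=0.8):
    """Top rows become the training set, the remainder the test set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

accounts = [{"id": 1, "plan": "basic"}, {"id": 2, "plan": "pro"}]
usage = [{"id": 1, "minutes": 120}, {"id": 2, "minutes": 45}, {"id": 3, "minutes": 10}]

joined = inner_join(accounts, usage, "id")   # customer 3 has no account row, so it drops out
train, test = partition(joined)
```

Like the DB Table Partitioner configured below, `partition` takes the first 80% of rows from the top; for a real model you would typically shuffle or stratify first.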
📍Technical Note:

If you don't have a Snowflake account, you can sign up for a free 30-day trial account (when signing up, select Enterprise edition and choose any Snowflake cloud/region - preferably AWS or Azure).

Configure the Snowflake Connector node:

After signing up, navigate to your Snowflake Account details. There, you'll find key information to configure the node:

  • Account identifier (input it in Full account name field)

  • Login name (input it in Authentication > Username & password field). The password field is the password used to log in to your Snowflake account.

  • Role (input it in Default access control role). Make sure the role is ACCOUNTADMIN.

  • Stay in your Account details and navigate to the Config File tab. Select the warehouse you prefer (e.g., SNOWFLAKE_LEARNING_WH, if you're using a free trial account) and input the warehouse name in the Virtual warehouse field.

📍Technical Note:

[For users with a free 30-day trial account] In the Write Telco data to DB component, keep the default configurations and simply execute the component. The Table names are defined automatically.

[For users with a regular account] Double-click on the Write Telco data to DB component and provide your preferred Database and Schema names (they must already exist). The Table names are defined automatically.

💡Pro tip: If you're working with very large datasets, we recommend replacing the DB Writer node with the DB Table Structure Creator + DB Loader nodes for faster, bulk data loading to the database.
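The bulk-loading path this tip describes corresponds to Snowflake's staged `PUT` + `COPY INTO` pattern rather than row-by-row INSERTs. The sketch below only builds the SQL statement pair; the table and file names are placeholders, not ones used by this workflow.

```python
def bulk_load_statements(table, local_csv):
    """Return the stage-and-copy SQL pair for bulk loading a local CSV:
    PUT uploads the file to the table's internal stage, COPY INTO ingests it."""
    put = f"PUT file://{local_csv} @%{table} AUTO_COMPRESS=TRUE"
    copy = (
        f"COPY INTO {table} "
        f"FROM @%{table} "
        f"FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
    return put, copy

put_sql, copy_sql = bulk_load_statements("TELCO_CHURN", "/tmp/telco.csv")
```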

AUC
ROC Curve
Score model
Scorer
DB Connection Closer
DB Reader
Connect to database
Snowflake Connector
Compute summary stats
DB GroupBy
Apply H2O model at scale in Snowflake
Snowflake H2O MOJO Predictor (Classification)
Top: Training (first 80%); Bottom: Test (bottom 20%)
DB Table Partitioner
Double-click and provide Database name and/or Schema name. Note: if the database table already exists, it will be overwritten
Write Telco data to DB
Reshape data
Read data into KNIME table
DB Reader
Keep only the churn and predicted columns
DB Column Filter
H2O Local Context
Keep just the first 1000 rows for the sake of this example
DB Row Sampler
Save model for later usage
H2O MOJO Writer
Inner Join2 tables basedon customer ID
DB Joiner
Table to H2O
Train model
H2O Random Forest Learner
DB Reader
View summary stats for numeric columns
Table View
H2O Model to MOJO
Remove unnecessary column
DB Column Filter
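The Scorer and ROC Curve nodes above report accuracy and AUC. As a reminder of what those numbers measure (this is the rank-based Mann-Whitney formulation of AUC, not KNIME's implementation), a minimal sketch:

```python
def accuracy(labels, predictions):
    """Fraction of predictions matching the true churn labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)

def auc(labels, scores):
    """AUC = probability that a random positive outranks a random negative;
    ties count as half a win."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```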
