Retail Mkting campaign assignment

Load data
Model building and evaluation

OBJECTIVE: Develop machine learning models to predict a positive response to a marketing campaign. Use the provided dataset to develop and evaluate the models. The description of the dataset variables is provided in the "Table View" node. Evaluate the 3 models (Logistic Regression, Decision Tree and Random Forest), using 70% of the observations to train the models and the remaining 30% to test them. Interpret the modeling results and select the best model based on its performance on the test set.
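The 70/30 partition described above can be sketched outside the workflow as follows. This is only an illustration in plain Python, not what the workflow nodes do internally: the function name `stratified_split`, the fixed seed, the choice to stratify by class, and the toy label vector are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=42):
    """Return (train_idx, test_idx) so that each class keeps
    roughly the same share in the training and test parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                      # random order within the class
        cut = round(len(idxs) * train_fraction)
        train_idx.extend(idxs[:cut])           # first 70% of the class -> train
        test_idx.extend(idxs[cut:])            # remaining 30% -> test
    return sorted(train_idx), sorted(test_idx)

# Hypothetical imbalanced target: 20 responders among 100 customers.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)

train_pos = sum(labels[i] for i in train_idx)
test_pos = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), train_pos, test_pos)
```

Stratifying is a design choice rather than a requirement of the assignment: with an imbalanced target it keeps the share of responders comparable between the two parts, so test-set metrics are not distorted by an unlucky draw.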

Before running the workflow select your batch and enter your group number by double-clicking on the "Model results" green component. Then execute the whole workflow and check the results.

QUESTION 1: Run the "Model results" component and open the view to see the results. Interpret the first 3 levels of the Decision Tree (i.e. the three levels below the root node), assessing some of the predictors used for each split and the resulting partitions. Focus on the most interesting results. Note: by clicking on "Table" in each node of the tree you can see the details of the target variable distribution. Keep in mind that the target variable in the training set is imbalanced!
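Because the target is imbalanced, a tree node can be interesting even when responders are still the minority in it: what matters is how its response rate compares with the overall rate. A minimal sketch of that comparison is shown below; the helper `node_summary` and all counts are hypothetical and are not taken from the assignment dataset.

```python
def node_summary(n_pos, n_total, base_rate):
    """Response rate of a tree node and its lift over the overall response rate."""
    rate = n_pos / n_total
    return {"response_rate": rate, "lift": rate / base_rate}

# Hypothetical training set: 150 responders among 1000 customers.
base_rate = 150 / 1000

# Hypothetical node: 40 responders among the 100 customers that reach it.
node = node_summary(n_pos=40, n_total=100, base_rate=base_rate)
print(node)
```

In this made-up case the majority class of the node is still "no response", yet responders are more than twice as frequent as in the whole training set, which is the kind of partition worth commenting on.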

QUESTION 2: Interpret the Random Forest variable importance and, together with the evidence from the interpretation of the Decision Tree, select two potential target groups for the next marketing campaign. Describe the characteristics of these two groups and define a set of actions you will perform to engage the customers in these groups. Note: when selecting the groups, remember that the target variable is imbalanced in the training data.

QUESTION 3: Evaluate the performance of the models by looking at the ROC curve, the area under the curve (AUC), and the confusion matrix metrics. Which model will you choose to predict the positive response to a marketing campaign for a new group of customers? Why? Note: The Confusion Matrix threshold shown in the results was chosen to maximize Youden's index.
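Youden's index is J = sensitivity + specificity - 1, and the threshold that maximizes it is the point of the ROC curve farthest above the diagonal. The sketch below shows one way such a threshold can be found by trying every observed score as a cut-off. It is an illustration only, not the component's actual implementation; the function names and the toy scores are assumptions.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def youden_best_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, t)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical predicted probabilities for 10 customers (3 responders).
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.90]

best_t, best_j = youden_best_threshold(y_true, scores)
print(best_t, round(best_j, 3))
```

Note that J weighs sensitivity and specificity equally, so the threshold it selects is not necessarily the one that best fits a campaign where missing a responder and contacting a non-responder have different costs.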

QUESTION 4: Using the model you selected in the previous step (the best one), what is the PRECISION of the model if you want it to correctly detect approximately 90% of the customers who will actually respond positively to the marketing campaign (i.e. a recall of about 90%)? (Note: Manually adjust the classification threshold to find the answer). What is the impact of this result (i.e. the precision of the model) on the practical use of the model to select a list of customers for the next marketing campaign?
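The manual threshold adjustment asked for here amounts to lowering the cut-off until about 90% of the actual responders are captured and then reading off the precision. A minimal sketch of that search follows; the function `precision_at_recall` and the toy scores are hypothetical and only illustrate the mechanics.

```python
def precision_at_recall(y_true, scores, target_recall=0.90):
    """Return (threshold, recall, precision) for the highest threshold
    whose recall reaches the target."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("y_true contains no positive cases")
    for t in sorted(set(scores), reverse=True):   # lower the threshold step by step
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        recall = tp / n_pos
        if recall >= target_recall:
            return t, recall, tp / (tp + fp)
    raise ValueError("target recall cannot be reached")

# Hypothetical (score, actual response) pairs: 10 responders, 10 non-responders.
data = [
    (0.95, 1), (0.90, 1), (0.85, 1), (0.80, 0), (0.75, 1),
    (0.70, 1), (0.65, 1), (0.60, 0), (0.55, 1), (0.50, 1),
    (0.45, 0), (0.40, 0), (0.35, 1), (0.30, 0), (0.25, 0),
    (0.20, 0), (0.15, 1), (0.10, 0), (0.05, 0), (0.02, 0),
]
scores = [s for s, _ in data]
y_true = [y for _, y in data]

threshold, recall, precision = precision_at_recall(y_true, scores)
print(threshold, recall, round(precision, 3))
```

In this toy example, reaching 90% recall means contacting 13 customers of whom 9 respond. The practical reading is the same for the real model: the precision at that threshold is the share of the contact list expected to respond, and one minus that share is the part of the campaign budget spent on non-responders.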

Data information
Load data
Excel Reader
Double-click and enter your group number before running the node!
Model results
Variables description
Table View
Data dictionary
Excel Reader