OBJECTIVE
Define the priority of action concerning customers, employees, products, and so on.
DESCRIPTION
Scoring models help to decide which elements to
act on as a priority based on the score that they obtain. For example, we can
create a scoring model to prevent employees from leaving the company, in which the
score depends on both the probability of leaving and the performance (we will
act first on those employees who have a higher probability of leaving and are
important to the company). Scoring models are also quite useful in marketing; for
example, we can score customers based on their probability of responding
positively to a telemarketing call and, based on our resources, call just the
first “X” customers.
The model that I propose here scores customers' value based on the probability
that they will purchase a product and on the amount that they are likely to
spend. This model is the result of two sub-models:
- Purchase probability: We will use a logistic regression to estimate the probability that a customer makes a purchase in the next period (see 26. RFM MODEL and 60. LOGISTIC REGRESSION);
- Amount: We will use a linear regression to estimate the amount that each customer is likely to spend on his or her next purchase (see 38. LINEAR REGRESSION).
The first step is to choose the predictor variables. In our case I suggest using
recency, first purchase, frequency, average amount, and maximum amount of
year -2, but we could try additional or different variables. The target variable
will be a binary variable that represents whether the client made a purchase
during the following period (year -1). We then run a logistic regression,
transforming the variables where necessary and after verifying that all the
necessary assumptions are met (see 36. INTRODUCTION TO REGRESSIONS and 60.
LOGISTIC REGRESSION).
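As an illustration of this first step, the following minimal sketch builds the predictor variables and the binary target from a raw purchase history. The transaction list, the calendar years standing in for year -2 and year -1, and the field names are all hypothetical; in particular, "first purchase" is assumed here to mean the number of days since the customer's first purchase, which the text does not specify.

```python
from datetime import date

# Hypothetical transaction history: (customer_id, purchase_date, amount).
transactions = [
    ("A", date(2022, 3, 10), 50.0),
    ("A", date(2022, 11, 2), 70.0),
    ("A", date(2023, 5, 20), 65.0),
    ("B", date(2022, 6, 15), 120.0),
]

YEAR_MINUS_2 = 2022  # assumed predictor period
YEAR_MINUS_1 = 2023  # assumed target period
PERIOD_END = date(YEAR_MINUS_2, 12, 31)


def build_rows(transactions):
    """Return one row per customer: year -2 predictors plus the year -1 target."""
    history = {}
    buyers_next_period = set()
    for customer, day, amount in transactions:
        if day.year == YEAR_MINUS_2:
            history.setdefault(customer, []).append((day, amount))
        elif day.year == YEAR_MINUS_1:
            buyers_next_period.add(customer)

    rows = {}
    for customer, purchases in history.items():
        days = [d for d, _ in purchases]
        amounts = [a for _, a in purchases]
        rows[customer] = {
            # Days between the last / first purchase and the end of year -2.
            "recency": (PERIOD_END - max(days)).days,
            "first_purchase": (PERIOD_END - min(days)).days,
            "frequency": len(purchases),
            "avg_amount": sum(amounts) / len(amounts),
            "max_amount": max(amounts),
            # Binary target: did the customer buy anything in year -1?
            "purchased": 1 if customer in buyers_next_period else 0,
        }
    return rows


rows = build_rows(transactions)
```

Customers with no purchase in year -2 have no predictor values and are left out of this sketch; how to treat them is a modeling choice.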
Coefficients of the Logistic and Linear Regressions
In the second part of the model, we can use, for example, only the average
amount and the maximum amount of year -2 as predictors, with the total amount
spent in year -1 as the target variable. We run a multiple linear regression,
transforming the variables where necessary and after verifying that all the
necessary assumptions are met (see 36. INTRODUCTION TO REGRESSIONS, 38. LINEAR
REGRESSION, and 39. OTHER REGRESSIONS). It is important to note that in this
regression we will not use the whole customer database but select only those
customers who made a purchase in year -1.
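The restriction to purchasing customers can be sketched as follows. This is only an illustration under stated assumptions: the customer records and field names are made up, and `fit_ols` is a small hand-written least-squares helper (normal equations solved by Gauss-Jordan elimination), standing in for whatever statistics package would be used in practice.

```python
def fit_ols(predictors, targets):
    """Least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, coefficient 1, coefficient 2, ...].
    """
    X = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data. The four buyers were constructed to follow
# amount = 10 + 0.5 * avg_amount + 0.2 * max_amount exactly.
customers = [
    {"avg_amount": 40.0, "max_amount": 60.0, "amount_next": 42.0},
    {"avg_amount": 100.0, "max_amount": 150.0, "amount_next": 90.0},
    {"avg_amount": 20.0, "max_amount": 80.0, "amount_next": 36.0},
    {"avg_amount": 70.0, "max_amount": 70.0, "amount_next": 59.0},
    {"avg_amount": 55.0, "max_amount": 90.0, "amount_next": 0.0},  # no purchase in year -1
]

# Keep only customers who actually purchased in year -1.
buyers = [c for c in customers if c["amount_next"] > 0]
X = [(c["avg_amount"], c["max_amount"]) for c in buyers]
y = [c["amount_next"] for c in buyers]

intercept, coef_avg, coef_max = fit_ols(X, y)
```

Including the non-buyer, with a spend of zero, would pull the estimated amounts down; the amount model is meant to answer "how much, given that a purchase happens", while the logistic model handles whether it happens.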
The last step is to put together the two regressions to score customers based on
both their purchase probability and the amount that they are likely to spend. We
apply the regression coefficients to each customer's data. In the linear
regression, we add the intercept to the sum of the variables' coefficients
(Figure below) multiplied by the actual values of each customer to estimate the
amount.[1]
However, in the logistic regression we must apply the logistic function, which
uses the exponential, to convert the linear combination into a purchase
probability (the linear combination by itself gives the log-odds, not the
probability):
Probability = 1 / (1 + exp(-(Intercept + Coefficient 1 * Variable 1 + ... +
Coefficient n * Variable n)))
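The probability formula can be written as a small function. This is a minimal sketch: the function name and the coefficient values in the example are illustrative assumptions, not the values from the figure.

```python
import math


def purchase_probability(intercept, coefficients, values):
    """Apply Probability = 1 / (1 + exp(-(intercept + sum(coef * value))))."""
    linear = intercept + sum(c * v for c, v in zip(coefficients, values))
    # Both branches are the same formula; the second form avoids overflow
    # in exp() when the linear combination is a large negative number.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)


# Hypothetical coefficients for (recency, frequency) and one customer's values.
p = purchase_probability(-1.0, [-0.01, 0.3], [59, 2])
```

A linear combination of zero gives a probability of exactly 0.5, and larger values push the probability toward 1.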
Result Table with the Purchase Probability, Estimated Amount, and Final Score
Now that we have two more columns in our database, we just need to add a third
one for the final score, which will be the purchase probability times the
estimated amount (Figure above). With this indicator we can either rank our
customers (to prioritize marketing and resource allocation for some customers)
or sum it across customers to estimate next-period revenues.
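Both uses of the final score can be sketched in a few lines. The customer records, the field names, and the value of "X" below are hypothetical.

```python
# Hypothetical output of the two sub-models for three customers.
customers = [
    {"id": "A", "probability": 0.80, "estimated_amount": 50.0},
    {"id": "B", "probability": 0.30, "estimated_amount": 200.0},
    {"id": "C", "probability": 0.10, "estimated_amount": 90.0},
]

# Final score = purchase probability * estimated amount.
for c in customers:
    c["score"] = c["probability"] * c["estimated_amount"]

# Use 1: rank customers and act on the first "X" of them.
ranked = sorted(customers, key=lambda c: c["score"], reverse=True)
top_x = 2
to_contact = [c["id"] for c in ranked[:top_x]]

# Use 2: the sum of the scores is an estimate of next-period revenue.
expected_revenue = sum(c["score"] for c in customers)
```

Note that customer B outranks customer A despite a much lower purchase probability, because the likely amount is four times larger; this is the point of combining the two sub-models.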
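[1] Estimated amount = Intercept + Coefficient 1 * Variable 1 + Coefficient
2 * Variable 2.
Be aware that, if we have transformed some of the variables, we cannot simply
multiply the coefficient by the raw value: we must first apply the same
transformation to the customer's value and, if the target variable was
transformed, reverse that transformation on the result.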
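As one example of such additional calculations, suppose (purely as an assumption for illustration) that both the predictors and the amount were log-transformed before fitting. The function name and the numbers below are hypothetical.

```python
import math


def estimate_amount_log_model(intercept, coefficients, raw_values):
    """Estimate the amount from a model fitted on log-transformed variables.

    Assumed model: log(amount) = intercept + sum(coef * log(variable)).
    """
    # Apply to the inputs the same transformation used when fitting.
    transformed = [math.log(v) for v in raw_values]
    log_estimate = intercept + sum(c * t for c, t in zip(coefficients, transformed))
    # Reverse the target's transformation to get back to currency units.
    return math.exp(log_estimate)


# Hypothetical coefficients applied to (average amount, maximum amount).
estimate = estimate_amount_log_model(math.log(2.0), [0.5, 0.5], [4.0, 9.0])
```

Other transformations (square root, standardization, and so on) need their own inverse. Simply exponentiating a log-scale prediction also tends to understate the average amount, so a correction may be worth considering.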