Uplift models could transform marketing’s customer targeting – so why isn’t everyone using them?
In this blog, we’ll cover:
- How marketers can use uplift modelling to find the customers most likely to be influenced by marketing – and therefore better target their spend.
- The challenges of uplift modelling and propensity models.
- How marketers can overcome these challenges to build uplift models that deliver real ROI.
The decisions that marketers make in the next few months will likely determine how well brands bounce back from the ravaging effects of the pandemic. Whether they’re planning the reopening of physical storefronts or deciding how to maintain a newly thriving eCommerce strategy, it’s clear they must spend every penny with pinpoint accuracy; right now, few companies have profit margins to spare.
As a result, many marketers are beginning to ask themselves a question that they’ve long assumed couldn’t be completely answered: how can I know how much of my marketing spend is wasted?
For many businesses, the answer is just as intimidating as the question: retailers need a way to predict which customers will actually change their behaviour in response to a piece of marketing. That way, instead of cannibalising revenue by sending discounts (or a subscription renewal offer, or a piece of direct mail) to customers who would have purchased the exact same products anyway, retailers can send personalised offers only to those who will actually be incentivised to buy more.
As I discussed in a previous blog, propensity models are useful to some extent here. We can use them to identify customers who the model suggests were unlikely to buy anyway – in most cases because they haven’t bought anything in a long time.
This sounds impressive, but there’s a catch. Our eCommerce experience – and that of our customers – shows that these consumers will rarely respond to personalised discount offers. Contacting them will mostly be a waste of time, though it won’t be an enormous waste of money. They probably won’t redeem the offer, but if they do, you’ll have extra revenue that you likely wouldn’t otherwise have had.
So, we have two sets of customers who we shouldn’t be targeting with offers. There are those who probably won’t buy again whatever happens – there’s little value to be gained from marketing to them. Then there are those who would probably have bought something organically – marketing to them is essentially revenue cannibalisation.
The real value on the table for marketers is in the centre ground. If you can quantify the tradeoff between likelihood to buy organically and likelihood to be influenced by an offer, you can find a sweet spot of customers where your marketing ROI will be highest.
This is where a machine learning technique called uplift modelling comes into play.
What is uplift modelling?
The perfect uplift model is a little like a crystal ball. You focus on customers that hold certain attributes that are outside of your control – let’s say age range – and ask the crystal ball a counterfactual question about things that you can control, like your marketing spend.
You might ask:
- How much will this customer buy if we leave them alone?
- How much will they buy if we offer them free shipping?
- How much will they buy if we give them a £10 cashback offer?
Your crystal ball (or uplift model) will essentially analyse the attributes you can’t control (demographic, time since last purchase) and estimate the impact of doing or not doing the things you can control (like sending a discount or offering free shipping).
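To make the crystal-ball idea concrete, here is a minimal Python sketch that estimates uplift as the difference in purchase rate between customers who received an offer and customers who didn't, within each segment. The segment names and records are made-up toy data, and `segment_uplift` and `make_records` are hypothetical helpers rather than functions from any particular library. The sketch also assumes offers were assigned at random; on purely rule-driven historical data this comparison would be biased.

```python
from collections import defaultdict

def segment_uplift(records):
    """Estimate uplift per segment from (segment, treated, purchased) records.

    Uplift = purchase rate with the offer - purchase rate without it.
    Returns None for a segment that lacks either treated or control customers,
    because there is no counterfactual to compare against.
    """
    # counts[segment][treated] = [number of buyers, number of customers]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, purchased in records:
        cell = counts[segment][treated]
        cell[0] += int(purchased)
        cell[1] += 1

    uplift = {}
    for segment, cells in counts.items():
        treated_buyers, treated_total = cells[True]
        control_buyers, control_total = cells[False]
        if treated_total == 0 or control_total == 0:
            uplift[segment] = None
        else:
            uplift[segment] = (treated_buyers / treated_total
                               - control_buyers / control_total)
    return uplift

def make_records(segment, treated, buyers, total):
    """Build toy records: `buyers` purchasers out of `total` customers."""
    return [(segment, treated, i < buyers) for i in range(total)]

# Made-up toy data for four hypothetical segments.
records = (
    make_records("loyal", True, 4, 5) + make_records("loyal", False, 4, 5)
    + make_records("occasional", True, 3, 5) + make_records("occasional", False, 1, 5)
    + make_records("lapsed", True, 0, 5) + make_records("lapsed", False, 0, 5)
    + make_records("new", True, 2, 5)  # always sent the offer: no control group
)

uplift = segment_uplift(records)
for segment, value in uplift.items():
    print(segment, value)
```

In this toy data only the "occasional" segment shows positive uplift. Loyal customers buy at the same rate either way, lapsed customers don't respond, and the "new" segment gets `None` because nobody in it was ever left alone.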
Why isn’t everyone doing this?
It’s easy to assume that you can glean this sort of information from a standard machine learning model, such as a pre-existing propensity model, in lieu of the crystal ball described above. After all, predicting future events based on known inputs is exactly what supervised machine learning is designed to do.
But this is unlikely to work for one crucial reason: any pre-existing machine learning model that your business uses will have been trained on historical data. And that completely skews any counterfactual predictions.
Let’s say, for example, that a retailer automatically sends a £20 cashback offer to any customer who has been inactive for 13 months. This prompts some customers to return and make another purchase, so the retailer has been doing it for a while.
This means they don’t have any data recording how purchase behaviour changes when the retailer takes a different action. What happens if someone is inactive for 13 months but doesn’t receive a cashback offer? What happens if they receive a £30 cashback offer instead? Likewise, they won’t be able to assess how effective it would be to send the same offer when a customer has been inactive for only 10 months, or if they had waited until the customer had been inactive for 18 months.
These kinds of basic knowledge gaps seriously compromise the use of off-the-shelf machine learning tools to calculate incrementality. What’s the optimal moment in a period of customer inactivity to send an offer? At what point in this period are incremental revenue gains outweighed by cannibalisation? How much do these factors vary between customers? There’s no way to know.
An algorithm trained on historical data will have a really good understanding of how sending a £20 cashback offer to someone who hasn’t bought in 13 months will affect their purchase behaviour. But it won’t be able to accurately estimate their probability of buying if they don’t receive that offer, if they receive another offer, or if they had received that same offer after fewer or more months of inactivity.
Nonetheless, if you ask the algorithm what it thinks would happen in any of those cases, it will give you a prediction. It won’t say “I don’t know” or alert you that this prediction is in any way untrustworthy. This is what’s known in machine learning as a “manifold problem”, or “going off-manifold”, and it is well known in the ML community. If you pass your machine learning algorithm a datapoint with the right attributes, it will always give you a prediction, regardless of the values of those attributes. When we ask our model counterfactual questions that don’t look sufficiently like any of our past data, we actually want it to say “I don’t know”, or to attach a “trustworthiness value” to that prediction.
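The behaviour is easy to reproduce with even the simplest model. The sketch below fits a straight line to a made-up history in which offers were only ever sent after 12 to 14 months of inactivity, then asks it about a customer who has been inactive for 40 months. The numbers and the `fit_line` helper are illustrative assumptions, not real campaign data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(x):
        return slope * x + intercept

    return predict

# Made-up history: offers were only ever sent after 12 to 14 months of inactivity.
months_inactive = [12, 13, 14]
response_rate = [0.30, 0.28, 0.26]

predict = fit_line(months_inactive, response_rate)

in_range = predict(13)   # close to the data the model was fitted on
far_out = predict(40)    # nothing like the data the model was fitted on

print(f"13 months: {in_range:.2f}")
print(f"40 months: {far_out:.2f}")
```

The model returns a number for 40 months without any warning, and in this toy case the number is a negative response rate, which is impossible. Nothing in the output distinguishes it from the in-range prediction.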
Many brands, buoyed by the success of standard off-the-shelf supervised learning techniques in other domains, have attempted to use propensity models to optimise their marketing strategies without having fully appreciated these subtleties. Unfortunately, the devil is often in the detail with these techniques; many of these attempts have ended up wasting money and alienating customers with badly targeted marketing.
However, with the right support and understanding, there are fixes for these problems.
How do we get uplift modelling right?
Marketing data scientists can do two things to mitigate the effects described above.
Build better algorithms
If past marketing followed no fixed rules and offers were sent out effectively at random, there will be plenty of natural variation in the data. Your algorithm will therefore be able to deliver trustworthy predictions for most scenarios that you can think of.
But what happens when we ask an algorithm to predict the outcome of an action we’ve never taken before, like sending a customer a £30 cashback offer instead of a £20 cashback offer?
The algorithm has no data on how customers react to this more generous discount, so it’s impossible for it to make an accurate prediction. We want it to say “I don’t know”. Instead, it will return a prediction for what the customer will do if they receive a £30 cashback offer, without alerting us that we should probably discard this prediction as unreliable.
We need a second algorithm to solve this. This algorithm is optimised to distinguish between data that looks like data it’s seen before and data that looks unfamiliar. In effect, it teaches the first algorithm to say “I don’t know”. There are a number of ways of approaching this problem, but they generally involve finding low-dimensional representations of the data, using techniques such as variational autoencoders.
In this case, our second algorithm will tell us that the prediction for sending a £20 cashback offer is robust, whereas the prediction for sending a £30 offer cannot be trusted. This means that we can’t answer the question of “should we send a £20 or a £30 offer” using machine learning alone.
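As a rough illustration of the "second algorithm" idea, the sketch below wraps a model in a familiarity check. It is not a variational autoencoder: it uses a deliberately simple nearest-neighbour distance test as a stand-in, and the function names, the distance threshold and the toy history are all assumptions made for the example.

```python
import math

def make_familiarity_check(history, max_distance):
    """Return a function that says whether a query resembles past data.

    A query counts as familiar if its nearest historical point is within
    `max_distance`. Real features would need scaling first; this toy
    example skips that step.
    """
    def is_familiar(query):
        nearest = min(math.dist(query, past) for past in history)
        return nearest <= max_distance
    return is_familiar

def guarded_predict(model, is_familiar, query):
    """Return the model's prediction, or None ("I don't know") off-manifold."""
    return model(query) if is_familiar(query) else None

# Made-up history of (months inactive, cashback offer in pounds).
history = [(12, 20), (13, 20), (13, 20), (14, 20)]
is_familiar = make_familiarity_check(history, max_distance=2.0)

# Hypothetical stand-in for a trained propensity model.
def model(query):
    return 0.3

print(guarded_predict(model, is_familiar, (13, 20)))  # seen before
print(guarded_predict(model, is_familiar, (13, 30)))  # never tried a 30 pound offer
```

The first call returns the model's prediction, while the second returns `None`, because no customer in the history ever received an offer close to £30.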
The real world is less black and white than this and data is rarely this monolithic. Often retailers will send out a one-off offer campaign to help hit revenue targets, or perhaps different people are in charge of different retention marketing campaigns and use slightly different rules. This creates some natural variation in the data. In this case, if we asked the propensity model to consider six different potential offers, our robustness-checking algorithm might tell us that only three of the predictions could be trusted.
Thus we could only answer which of these three offers is the right one to send.
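Choosing only among trusted predictions can be sketched as follows. The six offers, their predicted uplift values and the trust flags are invented for illustration, and `best_trusted_offer` is a hypothetical helper.

```python
def best_trusted_offer(predictions):
    """Pick the offer with the highest predicted uplift among trusted ones.

    `predictions` maps offer name -> (predicted_uplift, trusted).
    Returns None if no prediction can be trusted.
    """
    trusted = {
        offer: uplift
        for offer, (uplift, is_trusted) in predictions.items()
        if is_trusted
    }
    if not trusted:
        return None
    return max(trusted, key=trusted.get)

# Made-up predictions for six candidate offers; only three are trusted.
predictions = {
    "free_shipping": (0.02, True),
    "cashback_10": (0.03, True),
    "cashback_20": (0.05, True),
    "cashback_30": (0.09, False),
    "cashback_50": (0.12, False),
    "percent_off_10": (0.04, False),
}

choice = best_trusted_offer(predictions)
print(choice)
```

Here the untrusted offers have the most attractive predicted uplift, but they are excluded, so the choice falls on `cashback_20`, the best of the three offers the data can actually speak to.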
Gather better data
It’s a commonly-held belief that any data science problem can be at least partially solved by just gathering more data. But, as we’ve discussed above, if you just keep gathering more data on what happens while pursuing the same policy, you will never be able to break out of that cycle.
Instead, you need to run experiments, pursuing marketing policies different to business as usual (BAU) for certain subsets of customers. Such experiments are known as Randomised Controlled Trials (RCTs). RCTs are perhaps best known from the domain of drug trials, but their value is increasingly being recognised in the consumer space (where they are, in fact, much easier to run).
While such experimentation has clear benefits, it does have a real-world cost. Experimentation will likely result in the cannibalisation of some revenue by sending generous cashback offers to people who would have bought either way, or missing out by not sending offers to some customers who would have responded positively to marketing. This is the cost of knowledge and long-term improvement.
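Once a randomised trial has run, its result can be read out as a difference in purchase rates between the treated group and the control group, together with a standard error that indicates how much to trust it. The sketch below uses the standard formula for the difference between two proportions; the counts are made up and `uplift_with_error` is a hypothetical helper.

```python
import math

def uplift_with_error(treated_buyers, treated_total, control_buyers, control_total):
    """Return (uplift, standard_error) for a two-arm randomised trial.

    Uplift is the treated purchase rate minus the control purchase rate.
    The standard error is that of a difference between two proportions.
    """
    p_treated = treated_buyers / treated_total
    p_control = control_buyers / control_total
    uplift = p_treated - p_control
    standard_error = math.sqrt(
        p_treated * (1 - p_treated) / treated_total
        + p_control * (1 - p_control) / control_total
    )
    return uplift, standard_error

# Made-up trial: 1,000 customers received the offer and 1,000 did not.
uplift, standard_error = uplift_with_error(120, 1000, 90, 1000)
print(f"uplift: {uplift:.3f}, standard error: {standard_error:.4f}")
```

A larger trial shrinks the standard error but costs more in cannibalised or missed revenue, which is the trade-off described above.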
At Faculty, we believe that the best solution is to blend the approaches. It’s necessary to explore different marketing policies in order to gather the data required to train trustworthy models, but we do this while helping our customers avoid scenarios which are likely to be highly unprofitable.
It’s up to the budget holders how far from business as usual they wish to explore and how much short term profit they’ll sacrifice in return for understanding the efficacy of a broader range of marketing strategies in the long term. Using an algorithm such as a variational autoencoder allows us, when deciding for a given individual what the best intervention is, to be explicit about which pool of interventions we’re choosing from.
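One simple way to express this trade-off is an exploration fraction: most customers stay on business as usual, while a randomly chosen share receives an intervention from an explicitly approved pool. This is only a sketch under that assumption; `assign_policy`, the action names and the 10% fraction are illustrative, not a description of any particular production system.

```python
import random

def assign_policy(customer_ids, bau_action, explore_actions, explore_fraction, seed=0):
    """Assign each customer either the BAU action or a random exploratory one.

    `explore_fraction` is the share of customers (between 0 and 1) the budget
    holder is willing to move away from business as usual.
    `explore_actions` is the approved pool of alternative interventions.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    assignment = {}
    for customer_id in customer_ids:
        if rng.random() < explore_fraction:
            assignment[customer_id] = rng.choice(explore_actions)
        else:
            assignment[customer_id] = bau_action
    return assignment

customers = [f"customer_{i}" for i in range(1000)]
assignment = assign_policy(
    customers,
    bau_action="cashback_20",
    explore_actions=["no_offer", "free_shipping", "cashback_10"],
    explore_fraction=0.1,
)

explored = sum(1 for action in assignment.values() if action != "cashback_20")
print(f"{explored} of {len(customers)} customers are in the experiment")
```

Raising `explore_fraction` or widening `explore_actions` buys knowledge about more strategies at the cost of more short-term profit, and keeping the pool explicit makes it clear which interventions any later model can be trusted to compare.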
Machine learning has the potential to transform marketing – but only if done correctly
As more companies begin to develop the quality of their data and their machine learning capabilities, using ML to improve the ROI of marketing and promotional activities will become an increasingly popular strategy.
But only some of these strategies will reach their full potential: many companies will be significantly held back by their failure to appreciate the subtle challenges that come with using ML in this context. Without that understanding, it’s tempting to try to get quick results by repurposing standard supervised learning techniques that have worked in other domains. As we’ve shown above, that produces models that don’t entirely deliver on their promises and can ultimately destroy value.
The companies that avoid this, choosing the right tool for the problem, partnering with experts in marketing ML and investing in good data and testing infrastructure, will be the ones destined for greatness.
See uplift models in action
If you’d like to find out more about uplift models – and how marketers can use them to model consumer behaviour and increase the impact of marketing – you can join Dr Gary Willis for our talk, ‘Beyond propensity modelling: tweaks to enhance your machine learning marketing’ on 25th February. The talk will cover:
- Places where standard ML approaches like propensity models work – and where they fall down.
- Why tackling some of the most important problems in marketing requires more than just adding more data.
- How uplift models are quickly becoming more mainstream for modelling consumer behaviour and improving marketing impact.
You can register for the event here.