
Machine learning model explainability through Shapley values

31 Oct 2019

In the first of our series of technical blogs exploring explainability, we will look at how to use Shapley values to explain black-box models.

Model explainability aims to provide visibility and transparency into the decision making of a model.

On a global level, this means that we understand which features the model is using, and to what extent, when making a decision. For each single feature, we would want to understand how this feature is used, depending on the values it takes.

And on a local level, that is, for any individual data point, we would want to see why the model made a certain decision. This can give us more insight into where and why the model might fail.

What are Shapley values?

Shapley values are a concept from game theory (see Wikipedia for more info). Shapley values measure how much an individual player contributes to a game.

To compute them, we look at every possible coalition of players and the outcome that each coalition achieves.

The Shapley value of player $x$ is defined as the weighted average of the difference in outcome between coalitions that include player $x$ and the same coalitions without that player.

When using Shapley values for model explainability, we can think of each feature as a player, the game being the prediction of the target variable, and the score being the model output when predicting the target.

We calculate Shapley values by taking different coalitions of features, that is, different subsets of features that keep their true values.

In an ideal world, we would only pass those features, but unfortunately, a trained ML model needs to be passed a value for each feature it has been trained on.

We therefore fill features that are not part of the current coalition with random splices from the dataset.

The number of possible coalitions grows exponentially, so in practice we typically need to sample coalitions.

The noise that is introduced by random splicing and the sampling of coalitions causes the Shapley values to be estimates, and means they will have an uncertainty associated with them.

The more coalitions we sample, the smaller this uncertainty. The equation below describes how local Shapley values are calculated:

$\phi_y(x_i; x) = \sum_{S \subseteq \{1, \dots, n\} \setminus \{i\}} \frac{|S|! \, (n-1-|S|)!}{n!} \left[ f_y(x_S \cup x_i) - f_y(x_S) \right]$

Here, $\phi_y(x_i; x)$ is the Shapley value for feature $x_i$ for one datapoint $x$ in our dataset.

The sum is over all coalitions $S$ not containing feature $i$ , and $n$ is the dimensionality of $x$ .

The factor $\big[f_y(x_S \cup x_i) - f_y(x_S)\big]$ is the marginal contribution to $f_y(x)$ that feature $x_i$ makes when added to coalition $S$ .

This factor gets multiplied by $|S|!$ and $(n-1-|S|)!$ , the number of permutations of $x_S$ and $x \setminus (x_S \cup x_i)$ , respectively.

Dividing this by $n!$ makes the Shapley value $\phi_y(x_i; x)$ an average over all orderings in which $x$ can be constructed.
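To make the formula concrete, here is a minimal sketch that evaluates it by brute force for a toy model with three features. The names `coalition_value`, `exact_shapley`, `toy_model` and `background` are illustrative assumptions rather than part of any library, and the sketch averages over every background row instead of drawing random splices, which is only feasible for tiny examples.

```python
from itertools import combinations
from math import factorial


def coalition_value(model, x, coalition, background):
    """Average model output when features in `coalition` keep their true
    values from x and all other features are spliced in from background rows."""
    total = 0.0
    for z in background:
        row = [x[k] if k in coalition else z[k] for k in range(len(x))]
        total += model(row)
    return total / len(background)


def exact_shapley(model, x, i, background):
    """Shapley value of feature i for datapoint x by enumerating all coalitions."""
    n = len(x)
    others = [k for k in range(n) if k != i]
    phi = 0.0
    for size in range(n):  # |S| runs from 0 to n - 1
        weight = factorial(size) * factorial(n - 1 - size) / factorial(n)
        for S in combinations(others, size):
            with_i = coalition_value(model, x, set(S) | {i}, background)
            without_i = coalition_value(model, x, set(S), background)
            phi += weight * (with_i - without_i)
    return phi


def toy_model(row):
    # Hypothetical model: one additive feature and one interaction term.
    return 2.0 * row[0] + row[1] * row[2]


# Hypothetical background rows used to fill in features outside the coalition.
background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
x = [1.0, 1.0, 1.0]

phis = [exact_shapley(toy_model, x, i, background) for i in range(3)]
print(phis)  # roughly [1.0, 0.375, 0.375]
```

In this toy setup the three values add up to 1.75, which is the gap between the model output for `x` (3.0) and the average output over the background rows (1.25). The two features in the interaction term receive equal credit.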

Shapley values have a number of useful properties and benefits over other measures of feature importance:

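Unit: The Shapley values of all features sum to the total score, that is, the model accuracy for global values and the model output for local values, each measured relative to the baseline score obtained when no feature keeps its true value.
Symmetry: Two features that contribute equally to every coalition have the same Shapley value.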
Linearity: When building a linear ensemble model, the total Shapley value of a feature is the linear combination of its Shapley values across models.

Shapley values can be defined on a global level, indicating how the model overall uses the features, and a local level, indicating how the model made a decision for an individual data point.

The local Shapley values sum to the model output, and global Shapley values sum to the overall model accuracy, so that they can be intuitively interpreted, independent of the specifics of the model.
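Written out in symbols, the three properties look as follows. The notation is an assumption introduced here for illustration: $v(S)$ denotes the score achieved by coalition $S$, $\phi_i(v)$ the Shapley value of feature $i$ under that score, and $w$ a second score function.

$\sum_{i=1}^{n} \phi_i(v) = v(\{1, \dots, n\}) - v(\emptyset)$

$v(S \cup \{i\}) = v(S \cup \{j\}) \text{ for all } S \text{ containing neither } i \text{ nor } j \;\Rightarrow\; \phi_i(v) = \phi_j(v)$

$\phi_i(\alpha v + \beta w) = \alpha \, \phi_i(v) + \beta \, \phi_i(w)$

When the empty coalition scores zero, the first line reduces to the Shapley values summing to the full score.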

In what follows, we’ll walk through an example data set and see how global and local Shapley values can be calculated, visualised, and interpreted.

The dataset we are using is the Lending Club dataset. Lending Club is the world’s largest peer-to-peer lending platform. According to Wikipedia:

Lending Club enables borrowers to create unsecured personal loans between $ 1,000 and $ 40,000. Investors can search and browse the loan listings on Lending Club website and select loans that they want to invest in based on the information supplied about the borrower, amount of loan, loan grade, and loan purpose. Investors make money from interest. Lending Club makes money by charging borrowers an origination fee and investors a service fee.

We’ve trained a neural network on Lending Club’s data to predict loan outcomes: charged off versus fully paid.

Global Shapley values

We’ll start with an example for global Shapley values. They tell us how the model works overall.

For each feature, we draw multiple coalitions as described above and calculate the change in model accuracy as we add the feature in question to the coalition.

The weighted average of this change in accuracy over all drawn coalitions is the estimate of the Shapley value.

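The uncertainty in the estimate is the standard error of the mean, that is, the standard deviation of the marginal contributions over all drawn coalitions, divided by $\sqrt{M}$, where $M$ is the number of coalitions that we sampled. The variance of the contributions is

$\hat{\sigma}_j^2 = \frac{1}{M} \sum_{m=1}^M \left( \hat{f}(x^{m}_{+j}) - \hat{f}(x^{m}_{-j}) - \hat{\phi}_{j} \right)^2$

so the uncertainty on $\hat{\phi}_j$ is $\hat{\sigma}_j / \sqrt{M}$. Here, $\hat{f}(x^{m}_{+j})$ and $\hat{f}(x^{m}_{-j})$ are the scores for the $m$-th sampled coalition with and without feature $j$, and $\hat{\phi}_j$ is the estimated Shapley value of feature $j$.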
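The sampling procedure and its uncertainty can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code behind the results in this post: `global_shapley`, `toy_model` and the synthetic data are hypothetical, coalitions are drawn by shuffling the feature order and taking the features that precede feature `j`, and the same randomly drawn donor rows fill the missing features with and without `j`.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def accuracy(model, rows, labels):
    return mean(1.0 if model(r) == y else 0.0 for r, y in zip(rows, labels))


def global_shapley(model, rows, labels, j, n_samples, rng):
    """Monte Carlo estimate of the global Shapley value of feature j,
    together with the standard error of that estimate."""
    n = len(rows[0])
    gains = []
    for _ in range(n_samples):
        order = list(range(n))
        rng.shuffle(order)
        coalition = set(order[:order.index(j)])  # features preceding j
        donors = [rng.choice(rows) for _ in rows]  # rows to splice from
        without_j = [[r[k] if k in coalition else z[k] for k in range(n)]
                     for r, z in zip(rows, donors)]
        with_j = [[r[k] if (k in coalition or k == j) else z[k] for k in range(n)]
                  for r, z in zip(rows, donors)]
        # Change in accuracy when feature j joins the coalition.
        gains.append(accuracy(model, with_j, labels)
                     - accuracy(model, without_j, labels))
    # pstdev divides by M, matching the 1/M factor in the variance formula.
    return mean(gains), pstdev(gains) / sqrt(n_samples)


def toy_model(row):
    # Hypothetical classifier that only looks at feature 0.
    return 1 if row[0] > 0.5 else 0


rng = random.Random(0)
rows = [[rng.random() for _ in range(3)] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

results = [global_shapley(toy_model, rows, labels, j, n_samples=100, rng=rng)
           for j in range(3)]
for j, (estimate, std_error) in enumerate(results):
    print(f"feature {j}: {estimate:.3f} +/- {std_error:.3f}")
```

Because the toy classifier ignores features 1 and 2, their estimates are exactly zero, while feature 0 accounts for the whole gain in accuracy over random guessing.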

We can visualise these results to see which features are more or less important.

The most important feature by far is sub_grade. The error bars are small overall, so we can be fairly certain about both the ordering and the values of these Shapley values.

However, if those were larger, we would need to sample more coalitions – in particular for models with a large number of features, this can become an issue.

Local Shapley values

We’ll now move on to calculating local Shapley values. Local Shapley values are calculated on a single-row basis, i.e., for a single datapoint.

There are two things we could look at:

1) What is the distribution of local Shapley values for a given feature across the dataset?

This tells us how the model might be using the feature differently, depending on the value the feature takes.

2) What are the local Shapley values of all features for a specific row in our dataset?

This tells us why the model made a particular decision for a row in our dataset.

1) Distribution of local Shapley values

We’ll look at the feature fico_score, which is the third most important feature of our model, based on the global Shapley values.

We calculate the local Shapley value of fico_score for a number of datapoints and plot it against the value that fico_score takes for each datapoint.
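As a rough sketch of how the points for such a plot could be produced, the snippet below estimates the local Shapley value of one feature for several datapoints by sampling coalitions and random splices. Everything in it is an assumption made for illustration: `sampled_local_shapley` is a hypothetical helper, `toy_model` is a simple linear score rather than our neural network, and feature 0 merely stands in for a rescaled fico_score.

```python
import random
from statistics import mean


def sampled_local_shapley(model, x, j, background, n_samples, rng):
    """Estimate the local Shapley value of feature j for datapoint x."""
    n = len(x)
    contributions = []
    for _ in range(n_samples):
        order = list(range(n))
        rng.shuffle(order)
        coalition = set(order[:order.index(j)])  # features preceding j
        z = rng.choice(background)  # random row to splice from
        without_j = [x[k] if k in coalition else z[k] for k in range(n)]
        with_j = list(without_j)
        with_j[j] = x[j]  # feature j now takes its true value
        contributions.append(model(with_j) - model(without_j))
    return mean(contributions)


def toy_model(row):
    # Hypothetical score that increases with feature 0.
    return 2.0 * row[0] + row[1]


rng = random.Random(0)
background = [[rng.random(), rng.random()] for _ in range(200)]
points = [[0.1, 0.5], [0.5, 0.5], [0.9, 0.5]]

# One (feature value, local Shapley value) pair per datapoint.
scatter = [(x[0], sampled_local_shapley(toy_model, x, 0, background, 2000, rng))
           for x in points]
for value, phi in scatter:
    print(f"feature value {value:.1f} -> local Shapley value {phi:+.2f}")
```

For this linear toy model the estimates grow with the feature value, moving from negative to positive as the value crosses the background average. The resulting pairs are what a scatter plot of this kind would display.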

In the plot, we further colour the data points depending on their true label, and set the marker depending on the model prediction, which could be correct or incorrect.

We see that most data points fall closely together, and that the local Shapley values of fico_score vary between -0.1, which pushes the model towards the wrong label, and +0.15, which pushes it towards the true label.

From this plot, we can also identify outliers – points that do not fall within that narrow band, and where the model makes the wrong prediction.

Of course, we would now like to understand better why the model made the wrong prediction.

To that end, we can look at a bar chart similar to the one before, but this time showing the local Shapley values of all features for the particular data point that we are interested in:

The most important feature by far here is application_type – but we see that it is actually pushing the model prediction away from the true label. Why would this happen?

We can plot the distribution of application_type, split by the true label. We see that the application type is highly imbalanced, and only a small number of applications are filed jointly.
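A quick way to check for this kind of imbalance before plotting is to count rows per application type and label. The records below are made up for illustration and are not taken from the Lending Club data. The category names are placeholders as well.

```python
from collections import Counter

# Hypothetical (application_type, true_label) pairs, invented for this sketch.
records = (
    [("individual", "fully_paid")] * 80
    + [("individual", "charged_off")] * 17
    + [("joint", "fully_paid")] * 2
    + [("joint", "charged_off")] * 1
)

counts = Counter(records)  # rows per (type, label) pair
per_type = Counter(app_type for app_type, _ in records)  # rows per type

for (app_type, label), count in sorted(counts.items()):
    share = count / len(records)
    print(f"{app_type:<10} {label:<12} {count:>3}  ({share:.0%} of all rows)")

joint_share = per_type["joint"] / len(records)
```

A very small `joint_share` signals that the model has seen few examples of that category, so its predictions there deserve extra scrutiny.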

The datapoint which we were investigating fell into the latter category – and due to the lack of data here, our model is less accurate and cannot be trusted for people filing joint applications.

This example highlights how important explainability is for ML models. Without it, we would have missed this relationship and judged our model on its overall performance alone. By calculating the Shapley values, we were able to understand how the model uses features in individual cases to make a decision, and we can now take action to build a more robust model and prevent these misclassifications from happening.

In our next blog we will be looking at different approaches to explainability and how they compare.