The paper is the next step in a project that’s been a key priority for Faculty’s AI Safety research team for some time now: finding new ways to make the AI models we build and deploy for customers safer, more accurate and easier to understand.
At Faculty we divide AI Safety into four pillars:
- Fairness (does your AI have implicit biases?)
- Privacy (does it leak people’s data?)
- Robustness (does it fail unexpectedly?)
- Explainability (do we know how it works?)
This paper – which covers techniques already being used in our work with the NHS to forecast the ever-changing demand for hospital resources due to the COVID-19 pandemic – focuses on explainability.
Explaining how AI systems work is one of the most compelling pursuits within modern AI research and development. Not only is AI explainability a regulatory requirement in many applications, it’s also critical for making operational use of AI systems whenever they interact with humans. For organisations using AI, the human-AI interface starts with explainability.
The problem we face at Faculty is that many industry-standard approaches to explainability just don’t provide the accuracy or insight that organisations need to understand real-world AI models.
One approach to explainability that’s gaining traction with practitioners is based on Shapley Values, a method from cooperative game theory devised by the Nobel laureate Lloyd Shapley. It attributes a game’s (positive) outcome to its players according to how much each one drove that outcome. The primary benefit of Shapley Values is that they fully capture the correlations between players.
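To make the attribution rule concrete, here is a minimal sketch of exact Shapley values for a toy three-player game. The characteristic function `v` is illustrative (players 0 and 1 share a synergy bonus); the computation itself is the standard definition: average each player's marginal contribution over every ordering of the players.

```python
from itertools import permutations

# Illustrative coalition game: each player alone contributes 1,
# and players 0 and 1 together produce an extra synergy of 2.
def v(coalition):
    s = frozenset(coalition)
    return len(s) + (2 if {0, 1} <= s else 0)

def shapley_values(players, v):
    """Exact Shapley values: average each player's marginal
    contribution over all orderings of the players."""
    orderings = list(permutations(players))
    phi = {p: 0.0 for p in players}
    for order in orderings:
        coalition = set()
        for p in order:
            prev = v(coalition)
            coalition.add(p)
            phi[p] += v(coalition) - prev
    return {p: total / len(orderings) for p, total in phi.items()}

print(shapley_values([0, 1, 2], v))  # {0: 2.0, 1: 2.0, 2: 1.0}
```

The synergy between players 0 and 1 is split evenly between them, player 2 gets only its solo contribution, and the attributions sum exactly to the value of the full coalition — the "efficiency" property that makes Shapley Values attractive for explaining model outputs.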
However, correlation is not causation. Indeed, when it comes to explaining AI systems, getting the correlations right might not be enough.
In our NeurIPS paper, Faculty has extended the Shapley Value paradigm for AI explainability so that causal relationships obeyed by the data can be incorporated into explanations of AI systems.
This means, for example, that AI explanations can now quantify the impact of an individual’s demographics on a decision outcome without ignoring the fact that demographics like race, sex and age influence people’s education, career choices and so on, but not vice versa. Without this capability, AI explainability fundamentally lacks something basic that organisations need in order to know they’re deploying safe AI systems.
Excitingly, this work has not stopped at a research contribution. The technique is currently deployed within the NHS to explain forecasts of COVID-19 demand on hospitals, so that hospital operators can intuitively understand a forecast that ultimately affects their ability to schedule elective care. The causal knowledge that must be incorporated into COVID-19 forecasting is simple: things in the past affect things in the future, but not vice versa. Without getting this right, the explanations are highly counterintuitive, and a non-technical hospital operator would have no reason to trust the forecast.
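One way to picture the causal extension is as a restriction on the orderings used in the Shapley average: only orderings in which causal ancestors appear before their descendants are counted. The sketch below illustrates that mechanism on a toy two-feature game; the characteristic function `v` and the feature roles are illustrative only, and the full method in the paper pairs this ordering constraint with conditional expectations of the model rather than a hand-written game.

```python
from itertools import permutations

# Illustrative two-feature game: each feature alone is worth 1,
# and the pair carries a shared (correlated) bonus of 2.
def v(coalition):
    s = frozenset(coalition)
    return len(s) + (2 if {0, 1} <= s else 0)

def asymmetric_shapley(players, v, precedes):
    """Average marginal contributions only over orderings consistent
    with a causal partial order: (a, b) in `precedes` means the
    ancestor a must appear before the descendant b."""
    valid = [order for order in permutations(players)
             if all(order.index(a) < order.index(b) for a, b in precedes)]
    phi = {p: 0.0 for p in players}
    for order in valid:
        coalition = set()
        for p in order:
            prev = v(coalition)
            coalition.add(p)
            phi[p] += v(coalition) - prev
    return {p: total / len(valid) for p, total in phi.items()}

# Unconstrained: the shared bonus is split evenly between the features.
print(asymmetric_shapley([0, 1], v, precedes=[]))        # {0: 2.0, 1: 2.0}
# Constrained (feature 0 causally precedes feature 1): the shared
# contribution is no longer split symmetrically.
print(asymmetric_shapley([0, 1], v, precedes=[(0, 1)]))  # {0: 1.0, 1: 3.0}
```

In the temporal forecasting setting described above, the partial order is simply "earlier time steps precede later ones", which rules out orderings that would credit the future with influencing the past.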
We’re excited to see the principles outlined in our NeurIPS paper already beginning to make AI more understandable – and therefore more useful – in society. If you’d like to read the paper in full, you can find it here, or watch our presentation of the thinking behind the paper here.