Lesson 02
NHS
When the Covid pandemic hit in 2020, leaders in government and the NHS had to make decisions with terrifying implications, in an unprecedented situation, with data systems that had never been designed for a national health crisis. With infection rates soaring, AI was able to fill in the gaps and let decision makers get ahead of the curve.
They knew it was bad when they were allowed to keep their phones.
Normally, any visitor who walks through the famous black door of 10 Downing Street is asked to leave their phone in a rack by the entrance. But this was very far from normal times. It was March 2020, and Covid-19 had just locked down the country. In those early days, mortality predictions ran into almost unthinkable numbers. And the ultimate responsibility for stopping that happening - for making the decisions that would contain the pandemic and save millions of lives - rested with the people that Faculty were going to see.
Marc Warner and Andy Brookes, two of Faculty’s co-founders, had come to Downing Street to talk about using AI in the Covid response. What they didn’t know, as they walked up the famous stairs past the portraits of former prime ministers, was that the government was basing its decisions on a data-gathering system that would be recognisable to the bewhiskered, top-hatted Edwardian gentlemen in the paintings.
Every night, millions of people watched the Prime Minister and his science advisors present the latest statistics on TV. But as the Prime Minister called ‘Next slide, please’, behind the scenes the data feeding the presentation was coming in from hospitals all over the country on scraps of paper. The CEO of the NHS would read them out for aides to scrawl on a whiteboard, and then the country’s top scientists would use their iPhone calculators to project the likely trajectory of future cases.
Decision-making structures for a national crisis
The problem wasn’t that the NHS was stuck in a previous century: it wasn’t. But neither its decision-making structures nor its data flows had been designed for a national crisis. ‘In normal times, no country would look at having this type of centralised capability,’ says Lord Simon Stevens, the Chief Executive of the NHS at the time. ‘You don’t run your national health system as one big hospital. Obviously, in a pandemic, that may need to change.’
In other words, the NHS is designed to be operationally independent from government, and to provide localised services. Decisions are delegated down to trusts, hospitals and GPs’ surgeries across England. But when Covid hit, the need for clear, effective and centralised decision-making led to the creation of NHS Gold Command, a committee led by Professor Sir Keith Willett.
They met every day at 5.30pm at the NHS headquarters at Skipton House, a brown-marble-and-glass block that looms over Elephant and Castle tube station in south London. At the start of the pandemic, the normally bustling building was eerily quiet, apart from a few executives and some army personnel who had moved in. But at 5.30, the building would echo with the sound of raised voices in heated discussion in the conference room on the top floor. Where should they send vital supplies like PPE, oxygen, or ventilators? Should patients be transferred to quieter hospitals? If more beds needed to be urgently freed up, should they cancel long-planned elective procedures, like cancer care or hip operations?
Gold Command brought together senior managers from across England, representing two hundred acute hospital trusts. Every one of the people there had dedicated their lives to serving patients: now they were dealing with a once-in-a-century pandemic, fighting for their share of gravely limited supplies, knowing that the whole health system might be overwhelmed any day. The cost of getting their decisions wrong was horrific; but even getting them right would have profound consequences for patients, staff and the wider public.
And all these decisions were being made with the sort of imperfect information Marc and Andy had seen firsthand in Downing Street. The data landscape was fragmented and fraught with inaccuracies. Overstretched frontline staff, forced to choose between spending their time on patient care or data entry, were of course choosing the patients - but that lowered data quality at exactly the moment when the system needed it most. NHS analysts found themselves desperately trying to extract insight from thousands of spreadsheets, many of which were being manually updated and frantically e-mailed around to be combined with other data before being presented to decision makers.
‘I always remember the day I was called upstairs,’ says Ming Tang, the Chief Data Analytics Officer for NHS England. ‘Chris Whitty [the Chief Medical Officer for England] told us, “We need an infrastructure, a data store that brings everything together and then makes that data available to share across researchers. We need to know the state of the pandemic, and we need to be able to link that to health data to make sure that we know where the treatments need to be.”’
As Prime Minister Boris Johnson lay in hospital, seriously ill with Covid himself, Faculty worked with Ming’s team and its technology partners to engineer a properly robust data infrastructure. Out went scraps of paper, iPhone calculators and email threads clogged with spreadsheets. In came a streamlined, real-time data pipeline underpinned by key data flows including positive case numbers, NHS 111 call volumes, citizen mobility data from mobile phone providers, and even genetic material sampled from sewage wastewater. This pipeline fed dashboards for each hospital site, which could then be aggregated for decision makers at trust, system, regional or national level.
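To picture what that roll-up means in practice - and this is only an illustrative sketch, not the NHS’s actual data model - the same site-level feed can be aggregated to whichever level a decision maker works at. The column names and figures below are invented for the purpose of illustration.

```python
import pandas as pd

# Illustrative hospital-level snapshot; columns and numbers are invented
# for this sketch and do not reflect the NHS schema.
sites = pd.DataFrame({
    "hospital":            ["A", "B", "C", "D"],
    "trust":               ["Trust 1", "Trust 1", "Trust 2", "Trust 2"],
    "region":              ["London", "London", "Midlands", "Midlands"],
    "covid_beds_occupied": [120, 80, 95, 60],
    "beds_available":      [150, 110, 140, 90],
})

# The same site-level feed aggregates to trust, regional or national views.
by_trust  = sites.groupby("trust")[["covid_beds_occupied", "beds_available"]].sum()
by_region = sites.groupby("region")[["covid_beds_occupied", "beds_available"]].sum()
national  = sites[["covid_beds_occupied", "beds_available"]].sum()

print(by_trust)
print(by_region)
print(f"National occupancy: {national['covid_beds_occupied'] / national['beds_available']:.0%}")
```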
‘And within relatively short order,’ Lord Stevens recalls, ‘within about a week to ten days of deciding that we needed a centralised dashboard, we were able to assemble one. And I think we got there actually faster than most of the European countries.’
When Boris Johnson recovered, Faculty were able to demonstrate the full dashboard to him - although the data revolution hadn’t quite swept through Downing Street. A screen had to be rolled into the Cabinet room specially for the occasion. It showed a level of detail and insight that decision makers - from local NHS leaders all the way up to the Prime Minister - had simply never had before. ‘The Dashboard was so crucial,’ Johnson recalled in his evidence to the Covid Inquiry, ‘that the 9.15 meetings [the government’s daily ministerial strategy meetings] were later called the Dashboard meetings.’
But the pandemic wasn’t going away. Clear, reliable data was a huge step forward, but it was only a start. Whether a number was written on a scrap of paper, or flashed up on a real-time dashboard, it still only told you what had happened. What the decision makers really needed, as they fought to get ahead of the next waves of the crisis, was guidance on how their decisions might play out in the future.
They needed the numbers to tell them what was going to happen next.
Enter the Early Warning System
This wasn’t a new concept. NHS analysts had already tried to model future outcomes and concluded it was impossible - certainly at the hospital-by-hospital granularity the NHS needed to make operationally useful decisions. The data was patchy, and varied in quality across the country. With over two hundred large hospitals in England, it seemed an insurmountable challenge.
But the NHS team were open-minded, and with the stakes so high they agreed it was worth another attempt. A team of Faculty’s top executives - including the CEO, CTO, Director of Health and Director of Data Science - decamped to the unused office space at Skipton House to be as close as possible to their NHS counterparts. Eventually the team swelled to some 20 people, almost a fifth of the young company’s workforce.
‘We created joint teams who worked on this,’ says Ming, ‘and those teams were fantastic in terms of helping us create the data science necessary. Faculty were very hands on, and they just rolled their sleeves up. We felt like one team. And that was really an uplifting capability for us.’
But time was against them. An exhausted country had emerged out of lockdown in July 2020, but as summer turned to autumn and cases started to rise again, it was obvious that the pandemic was gathering steam. New vaccines offered hope, but even on the most optimistic timescales they were months away from making a difference. As talk turned to ‘circuit-breaker lockdowns’ and ‘tiers’, it became clear that the executives in Skipton House would once again be making hard choices.
In the end, a technique known as Bayesian hierarchical modelling turned out to be an unlikely, unsung hero of the pandemic. Even in areas where there was almost no data available, it allowed Faculty to build a compound model, named the Early Warning System (EWS), that provided a sensible forecast by sharing information from nearby hospitals with similar characteristics. That approach dealt with both the inherent uncertainty in the data, and the challenge of trying to predict complex outcomes. Now the NHS could look three weeks ahead to see the bed capacity each hospital was forecast to have, where it risked running out and where patients or resources might need to be transferred.
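The internals of the EWS aren’t published, but the core trick of partial pooling - letting a data-poor hospital borrow strength from similar hospitals - can be illustrated with a deliberately simplified sketch. The hospitals, rates and observation windows below are invented, and a full Bayesian hierarchical model would infer the group-level parameters jointly rather than plugging in the crude estimates used here.

```python
import numpy as np

# Toy illustration of partial pooling: daily Covid admissions at four
# hospitals in one region. All rates and observation windows are invented.
rng = np.random.default_rng(0)
true_rates    = [30, 28, 35, 32]   # hospital D is similar to its neighbours...
days_observed = [21, 21, 21, 2]    # ...but has hardly any data yet
samples = [rng.poisson(rate, days) for rate, days in zip(true_rates, days_observed)]

means = np.array([s.mean() for s in samples])
# Approximate sampling variance of each hospital's mean (Poisson: rate / n days).
sampling_var = np.array([s.mean() / days for s, days in zip(samples, days_observed)])

# Crude estimates of the shared regional mean and between-hospital variance;
# a genuine hierarchical model would infer these jointly from the data.
mu   = means.mean()
tau2 = max(means.var() - sampling_var.mean(), 1e-6)

# Each hospital's estimate is shrunk towards the regional mean, and the
# hospital with the least data is shrunk the most.
weight = tau2 / (tau2 + sampling_var)
pooled = weight * means + (1 - weight) * mu

for name, raw, est in zip("ABCD", means, pooled):
    print(f"hospital {name}: raw mean {raw:5.1f}  partially pooled {est:5.1f}")
```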
‘There were a lot of models being created across the system predicting where the virus was going and the rate of infection,’ says Ming. ‘But for us the focus was how operationally we would respond as an NHS, and so the model we created was much more important for forecasting beds, forecasting the likely impact of our staffing, forecasting which region would need to be most prepared. And then as we got the vaccine, that became really important because that then helped us identify where to put the vaccine next.’
For the first time, every level of the system was operating from the same page: from hospital managers making choices about how to allocate resources on their wards, right up to decisions being taken in Whitehall and Downing Street.
A model explainable by default
But as these people looked at the data, could they trust the forecasts they were being given? When you’re responsible for making decisions of this magnitude, it’s not enough to be told what the computer says: you have to understand why. How did the algorithm reach its conclusions, and how confident can you be in what it’s telling you? What struck the Faculty team again and again, as they worked on the project, was just how urgently their NHS counterparts needed to understand what the EWS was telling them. Why does it think that this hospital is going to run out of beds? What information is it basing that on? How confident should I be?
These are essential questions. They form the basis of good decision-making, and any technology used for decision support needs to be able to answer them convincingly. After all, even with the most sophisticated AI model, it’s still humans who make the big calls - and are held accountable for the outcomes. The AI is there to help them make the best possible choices. Which means that the technology has to be designed from the ground up to support humans and to keep them firmly in control. Most of all, it has to earn their confidence.
‘We found there were three key ingredients to making the model trustworthy,’ recalls Myles Kirby, then the Healthcare business unit director for Faculty, who worked on the project team. First, there was what they dubbed the ‘decision-centric’ approach. ‘What a lot of analytical technology gets wrong,’ says Myles, ‘is it throws as many charts and numbers as possible at the user, and that’s counterproductive. It overwhelms them, and distracts them with reams of data they don’t need.’ In contrast, the ‘decision-centric’ approach takes as its starting point the specific decision a user needs to make, and then identifies the precise set of analyses they need to make it better. If a particular analysis doesn’t help the decision, it doesn’t get included. The system is parsimonious by design - and, by design, it forces AI systems to be built in ways that serve the unique needs of users as decision-makers.
Secondly, Faculty built the technology robustly, so it could constantly be tested against actual outcomes. In effect, the users could ‘rewind time’: review what the model said at the point a decision was made, and then compare it to what actually happened. Crucially, the objective here wasn’t to maximise confidence in the EWS, but to calibrate it. By being able to compare forecasts against actual decisions and outcomes, the NHS users could understand the right level of confidence to place in the technology, neither slavishly deferential nor unnecessarily sceptical. After all, even the most accurate forecasts - like people - are imperfect. Knowing how much you can rely on them builds trust.
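A minimal sketch of what that ‘rewind time’ check might look like - assuming, purely for illustration, that the model publishes a central forecast and a 90% interval - is below. Every figure is invented; a real backtest would cover every site and every forecast horizon.

```python
import numpy as np

# Forecasts the model made three weeks earlier, and what actually happened.
# All figures are invented for illustration.
forecast_mean = np.array([210, 340, 180,  95, 400])   # predicted Covid bed demand
forecast_low  = np.array([180, 300, 150,  70, 350])   # lower bound of 90% interval
forecast_high = np.array([240, 380, 210, 120, 450])   # upper bound of 90% interval
actual        = np.array([225, 395, 175,  88, 410])

abs_error = np.abs(actual - forecast_mean)
covered   = (actual >= forecast_low) & (actual <= forecast_high)

print("Mean absolute error:", abs_error.mean())
# Over many forecasts, a well-calibrated 90% interval should cover ~90% of outcomes.
print("Interval coverage:", covered.mean())
```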
And once you understand why the model’s telling you what it is, you get new insights into the way things are changing that help you decide how to intervene. One of the most interesting examples of this came when the EWS - and other models - predicted a Covid spike in a particular city in the Midlands.
‘People in government were looking at all these models,’ remembers Faculty’s John Mansir, who was working as a Senior Data Scientist at the time. ‘As soon as they saw this uptick, they thought they’d need a localised lockdown. But when we dug into the reasons why the model was predicting a spike, we saw the increased cases were all confined to a particular hospital in that city. There were none of the broader indicators that would imply the spike was spreading through the community. We suggested that the NHS investigate within-hospital transmission of the virus first, rather than assuming it was prevalent in the wider community, and in fact that turned out to be the correct diagnosis.’ The right decisions were made, and the region was saved from a costly lockdown.
Adopting a ‘decision-centric’ approach
Faculty’s Early Warning System quickly became the analytical centrepiece of the decision-making process. The decisions were still big, the stakes just as high, but Skipton House was a quieter place. When the ‘Kent’ Covid variant (later renamed the Alpha variant) ran rampant in January 2021, and London finally ran out of intensive care beds, the model was able to advise leaders where critically ill patients should be transferred by helicopter, based not only on where capacity was that day, but where it would be in three weeks’ time and where the wave was likely to hit next. Even when SPI-M, the government’s official modelling group, was forced to stop their work because the uncertainties had got too large, the Faculty model kept going.
Faculty’s model outputs informed the allocation of over a billion pieces of PPE, facilitated the strategic transfer of critically ill patients across the country, and helped government leaders decide whether hospitals, towns and cities were opened up or locked down. In 2020 and 2021, these were matters of life and death, health and livelihood, for the whole UK population.
‘The Faculty team were high-calibre, engaged, and flexible. They understood what the use case was that we were looking to develop, and worked with us to continually improve it,’ says Lord Stevens. ‘It was a distinctive contribution that was not made by anybody else to that particular problem that we needed to resolve.’
The model wasn’t making decisions, and it wasn’t offering infallible predictions. The reason it worked so well was because it had been built first and foremost to be decision-centric, to give officials and managers no more than they needed. It had been designed in such a way that users could learn how much to trust it, and so that they could interrogate how it had reached its conclusions. It was neither a crutch nor a replacement for humans using their judgement. It was a tool - but a tool unlike any other. Used correctly, it gave decision-makers the insight they needed to totally transform the speed, quality and execution of their decision-making - just when they needed it most.
‘What was really valuable about the model was that we created a process around it,’ says Ming. ‘Every day we'd bring the emergency team that were actually dealing with the pandemic together with the data scientists, triangulating that information. And no model is ever perfect, but actually having a model and the gut feel and the experience in the room together to discuss it, we came up with a game plan that everyone was comfortable with.
‘We built consensus around data,’ she adds, ‘which was really powerful, because it’s the human and the data interaction that actually comes out with the best kinds of results.’
One day, in the later stages of the pandemic, Marc entered Downing Street for another meeting. A security guard stopped him, pointing to a telltale rectangular bulge in Marc’s hip pocket. Embarrassingly, Marc had left the torch on, so a light glowed through the fabric of his trousers.
‘I’m afraid you’ll have to leave that at the door, sir,’ the guard said politely.
Marc put his phone in the rack.
The lesson in summary
AI is technology for human decision makers.
- All software should be built around the user. The user for intelligent software is typically a decision maker. Focus AI on the places in which improvements in the speed, quality and execution of decision-making will improve business performance.
- Where a decision is important, human decision makers should remain in control and accountable. AI is there to support them, not replace them.
- Drowning people in data and dashboards doesn’t help their decision-making. Instead, you need to be precise about exactly how the technology you implement will enhance their decision-making, and be parsimonious about giving them that and only that. At Faculty we use the Decision Loop methodology to make sure that solutions are carefully scoped to achieve this.
- Decision makers need to be able to judge how far to trust AI systems. Models must have the requisite level of explainability, so that users can see why they predict what they do. Visibility of how accurate a model’s historic predictions were can also help calibrate how much weight to place on its outputs.
- Interactive systems provide better decision support than passive dashboards. If model predictions are explainable, then decision makers can understand the cause and effect relationships at play in a given situation. And allowing them to see how outputs vary when inputs and assumptions change means they can test the outcomes of different choices before they make them - as the short sketch below illustrates.
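By way of illustration only - the bed counts and growth-rate assumptions below are invented - a ‘what-if’ view can be as simple as projecting the same forecast under different assumptions and comparing the results:

```python
# Toy scenario comparison: project Covid bed demand three weeks ahead under
# different assumed weekly growth rates. All figures are invented.
current_occupied = 320   # beds occupied today
capacity         = 400   # total beds available

for weekly_growth in (0.05, 0.15, 0.30):
    projected = current_occupied * (1 + weekly_growth) ** 3
    verdict = "over capacity" if projected > capacity else "within capacity"
    print(f"{weekly_growth:.0%} weekly growth -> ~{projected:.0f} beds needed in 3 weeks ({verdict})")
```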