In early 2021, Faculty collaborated with NHSX to create the National COVID-19 Chest Imaging Database (NCCID), a national medical imaging database that accelerated research into artificial intelligence (AI) models that could detect COVID-19. Healthcare leaders saw the potential of AI and wanted to know how AI models should be used, validated and regulated for wider use in clinical practice.
Faculty led a consortium which pioneered a model validation process to test the performance of AI models, enabling the NHS to adopt safer, more accurate AI in the future.
We worked with NHSX (now part of the NHS Transformation Directorate) and led a consortium of academic research groups to validate four AI models. The consortium included:
- the Scientific Computing team at the Royal Surrey Foundation Trust,
- Health Informatics Centre at the University of Dundee,
- statisticians from Queen Mary’s University, and
- radiologists at the British Society of Thoracic Imaging.
During the pandemic, hundreds of AI models were trained to diagnose COVID-19 from medical images. However, many of these models were not fit for purpose: they suffered from a lack of reproducibility and robustness, and were potentially biased.
Following the onset of the COVID-19 pandemic, the use of AI for medical diagnosis continued to accelerate rapidly. Data from the NCCID played a key role in this. Researchers and developers were able to access over 50,000 chest scans from suspected COVID-19 patients through the NCCID’s database. Databases like the NCCID were used to train AI models to detect COVID-19 from chest scans.
NHSX wanted to support the adoption of safe AI, but without a rigorous external validation process it could not determine whether the models were safe and accurate. The challenge was how to develop and implement a validation process to assess the AI models’ performance, paving the way for safer AI adoption in the NHS.
The NCCID Validation Programme defined a novel model validation process to assess the performance of AI models in diagnosing COVID-19, ensuring new AI tools were both safe and effective across the population.
We put in place an external validation process for four AI models. We ran the models against the NCCID’s validation dataset, putting each model’s results through several rigorous statistical tests to determine model performance.
The statistical tests calculated how accurately the models detected positive and negative COVID-19 cases from medical images. They also assessed how the models performed for sub-groups (e.g. performance across age, ethnicity and gender, and across intersections of these attributes, such as ethnicity within a given age group) and how robustly the models performed in response to changes in the data (e.g. patients with other medical conditions).
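To give a flavour of what sub-group analysis involves (this is an illustrative sketch with synthetic data, not the consortium's actual validation code), sensitivity and specificity can be computed overall and then separately within each sub-group to reveal disparities in performance:

```python
# Sketch: sensitivity/specificity for a binary classifier, overall and
# per sub-group. All data below is synthetic and for illustration only.

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP rate among true positives; specificity = TN rate
    among true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

def subgroup_metrics(y_true, y_pred, groups):
    """Metrics computed separately per sub-group label (e.g. age band)."""
    out = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        out[g] = sensitivity_specificity([y_true[i] for i in idx],
                                         [y_pred[i] for i in idx])
    return out

# Synthetic labels, predictions, and a hypothetical age-band attribute
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
age    = ["<60", "<60", "<60", "<60", "60+", "60+", "60+", "60+"]

print(sensitivity_specificity(y_true, y_pred))  # overall: (0.75, 0.75)
print(subgroup_metrics(y_true, y_pred, age))
```

Here the overall metrics look reasonable, yet the "<60" sub-group performs markedly worse than "60+", which is exactly the kind of hidden bias per-sub-group testing is designed to surface.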
We shared guidance on the validation process and created open-source tools to help other healthcare data scientists validate their own models. Data science teams in private companies, start-ups and research groups can use these tools to improve the performance and safety of their models.
We also built an interactive online tool for developers to easily determine the optimal operating point for their model depending on the prevalence of COVID-19 in the community. This lets them adjust the model’s threshold to detect as many positive cases as possible (a low false negative rate) while keeping the false positive rate low.
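One common way to make a threshold prevalence-aware (a minimal sketch of the general idea, not the actual tool's method; the data and function below are hypothetical) is to pick the score cut-off that minimises the expected error rate, weighting missed positives by prevalence and false alarms by its complement:

```python
# Sketch: prevalence-weighted operating-point selection. Predict positive
# when score >= threshold; choose the threshold minimising
#   prevalence * FNR + (1 - prevalence) * FPR.

def pick_threshold(scores, labels, prevalence):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best_t, best_err = None, float("inf")
    for t in sorted(set(scores)):
        fnr = sum(1 for s in pos if s < t) / len(pos)   # missed positives
        fpr = sum(1 for s in neg if s >= t) / len(neg)  # false alarms
        err = prevalence * fnr + (1 - prevalence) * fpr
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

# Synthetic model scores and ground-truth labels
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2, 0.6, 0.1]
labels = [1,   1,   1,   1,   0,   0,   0,   0]

print(pick_threshold(scores, labels, prevalence=0.3))  # threshold 0.7
print(pick_threshold(scores, labels, prevalence=0.8))  # threshold 0.3
```

As prevalence rises, false negatives dominate the expected error, so the optimal threshold drops and the model flags more cases as positive.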
Our model validation process has paved the way for safer AI adoption in healthcare, helping the NHS deploy effective and safe models in the future.
Dominic Cushnan, Head of AI Imaging at NHSX, said: “This external validation process is critical, especially given the lack of model validation of AI models used in radiology. Faculty and the consortium have opened up vast opportunities not just for future AI adoption in the NHS, but for developers to build better performing and safer AI models, too.”
Our rigorous validation and testing procedures established a novel process for verifying that adopted AI models are safe, robust and accurate in diagnosing COVID-19 – while protecting developers’ intellectual property. Unfair and biased models can lead to inconsistent levels of care, a serious problem in these critical circumstances. Introducing new standards for AI safety has never been more important.
Our work will also help refine the future NHS response to the COVID-19 pandemic and ensure future AI tools are always adopted safely. Outside of the NHS, our validation process has helped guide the use of AI in medical diagnosis and inform new approaches to the international governance of AI in healthcare.
- Research groups are building AI models using the NCCID.
- Hospital trusts supply data for the NCCID.