Medical information is more complex and less available than the web data that many algorithms were trained on, so results can be misleading.
THE CORONAVIRUS PANDEMIC has prompted countless acts of individual heroism and some astounding collective feats of science. Pharmaceutical companies used new technology to develop highly effective vaccines in record time. A new type of clinical trial has remade our understanding of what works, and doesn’t work, against Covid-19. But when the UK’s Alan Turing Institute looked for evidence of how artificial intelligence had helped with the crisis, it didn’t find much to celebrate.
The institute’s report, published last year, said that AI had made little impact on the pandemic and experts faced widespread problems accessing the health data needed to use the technology without bias. It followed two surveys that reviewed hundreds of studies and found that nearly all AI tools for detecting Covid-19 symptoms were flawed. “We wanted to highlight the shining stars that show how this very exciting technology has delivered,” says Bilal Mateen, a physician and researcher who was an editor of the Turing report. “Unfortunately we couldn’t find those shining stars; we found a lot of problems.”
It’s understandable that a relatively new tool in health care, like AI, couldn’t save the day in a pandemic, but Mateen and other researchers say the failings of Covid-19 AI projects reflect a broader pattern. Despite great hopes, it’s proving difficult to improve health care by marrying data with algorithms.
Many studies using samples of past medical data have reported that algorithms can be highly accurate at specific tasks, such as finding skin cancers or predicting patient outcomes. Some are now incorporated into approved products that doctors use to watch for signs of stroke or eye disease.
But many more ideas for AI health care have not progressed beyond initial proofs of concept. Researchers warn that, for now, many studies don’t use data of adequate quantity or quality to properly test AI applications. That raises the risk of real harms from untrustworthy technology let loose in health systems. Some health care algorithms in use have proved unreliable, or biased against certain demographic groups.
That data-crunching might improve health care is not a new notion. One of the founding moments of epidemiology came in 1855, when London physician John Snow marked cholera cases on a map to show that it was a water-borne disease. More recently, doctors, researchers, and technologists have become excited about tapping machine learning techniques honed in tech industry projects like sorting photos or transcribing speech.
Yet conditions in tech are very different from those inside research hospitals. Companies such as Facebook can access billions of photos posted by users to improve image-recognition algorithms. Accessing health data is harder because of privacy concerns and creaky IT systems. And deploying an algorithm that will shape someone’s medical care carries higher stakes than filtering spam or targeting ads.
“We can’t take paradigms for developing AI tools that have worked in the consumer space and just port them over to the clinical space,” says Visar Berisha, an associate professor at Arizona State University. He recently published a journal article with colleagues from engineering and health departments at Arizona State warning that many health AI studies make algorithms appear more accurate than they really are because they use powerful algorithms on data sets that are too small.