AI Struggles Across Health Systems, Needs Wide-Range Testing

November 19, 2018

By Samara Rosenfeld


In a new study, AI performed significantly worse at diagnosing pneumonia in X-ray images that came from health systems other than its own.

Deep learning models may not perform as accurately as expected if AI in the medical space is not tested more carefully.

Artificial intelligence (AI) performed worse at detecting pneumonia in X-ray images from outside health systems than in images from its original organization, suggesting that AI must be tested carefully for performance across a wide range of populations, according to a new study.

Researchers from the Icahn School of Medicine at Mount Sinai used convolutional neural networks (CNNs) to analyze chest X-ray images to help provide a pneumonia diagnosis. The researchers used CNNs across three hospital systems — National Institutes of Health Clinical Center, Mount Sinai Hospital and Indiana University Network for Patient Care — for a simulated pneumonia screening task.


In three out of five comparisons, researchers determined that the CNNs’ performance in diagnosing diseases on X-rays from hospitals outside their own network was significantly lower than on X-rays from the original health system. But the CNNs exhibited a high degree of accuracy in detecting the hospital system where an X-ray was acquired.

According to the researchers, deep learning models use a very large number of parameters, which makes it difficult to identify the specific variables driving their predictions and complicates any assessment of their effectiveness in healthcare.

“Our findings should give pause to those considering rapid deployment of artificial intelligence platforms without rigorously assessing their performance in real-world clinical setting reflective of where they are being deployed,” senior author Eric Oermann, M.D., instructor in neurosurgery at the Icahn School of Medicine at Mount Sinai, said in a statement.

CNN systems used for medical diagnosis need to be tailored to consider clinical questions, tested for real-world scenarios and assessed to determine how they affect accurate diagnoses, first author John Zech, a medical student at the Icahn School of Medicine at Mount Sinai, said in a statement.

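CNNs’ performance in diagnosing diseases on X-rays may reflect not only their ability to identify disease-specific imaging findings but also their ability to exploit confounding information, such as the hospital system where an image was acquired. As a result, performance measured on X-rays from the original health system may overstate the CNNs’ real-world performance.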
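The kind of internal-versus-external comparison described here can be illustrated with a minimal Python sketch. It is not the researchers' code: the site names, labels and scores below are invented toy values, and the helper functions `auc` and `site_report` are hypothetical names introduced only for this illustration. The sketch assumes a model has already produced a pneumonia score for each X-ray and simply compares the area under the ROC curve (AUC) at each site.

```python
from typing import Dict, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Counts how often a positive case is scored above a negative case;
    ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def site_report(
    results: Dict[str, Tuple[Sequence[int], Sequence[float]]]
) -> Dict[str, float]:
    """Compute one AUC per site from (labels, scores) pairs."""
    return {site: auc(labels, scores) for site, (labels, scores) in results.items()}


# Invented toy data: 1 = pneumonia, 0 = no pneumonia.
# "internal_site" stands in for the health system the model was trained on,
# "external_site" for a hospital outside that system.
toy_results = {
    "internal_site": ([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]),
    "external_site": ([1, 1, 1, 0, 0, 0], [0.9, 0.4, 0.2, 0.6, 0.3, 0.1]),
}

report = site_report(toy_results)
gap = report["internal_site"] - report["external_site"]

for site, value in report.items():
    print(f"{site}: AUC = {value:.3f}")
print(f"internal minus external: {gap:.3f}")
```

A large gap between the two figures would be a warning sign that performance measured at the original site does not carry over. A real evaluation would also need far more cases per site and a statistical test of the difference.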

Deep learning models may not perform as accurately as expected if AI in the medical space is not tested more carefully, the study found.

A total of 158,323 chest radiographs were drawn from the three participating institutions. Researchers elected to study the diagnosis of pneumonia on chest X-rays due to its common occurrence, clinical significance and prevalence in the research community. Before computer-aided devices can be used in real-world clinical settings, they must first be able to generalize across a variety of hospital systems.
