Stanford Doctors Urge Realism Towards Machine Learning

“Whether such artificial-intelligence systems are ‘smarter’ than human practitioners makes for a stimulating debate—but is largely irrelevant,” write researchers Jonathan H. Chen, MD, PhD, and Steven M. Asch, MD, MPH, in a new Perspective for the New England Journal of Medicine.

Both authors are Stanford faculty, Asch in General Medicine and Chen in Medical Informatics. Their Perspective piece in no way overlooks the potential of big data and machine learning for medicine, but rather ticks off a series of potential sticking points that often go unmentioned in what they call the “hype cycle.”

Given the constantly evolving nature of medical science, the two authors believe that big-data efforts in medicine will always be chasing a moving target. They point to the quiet failure of Google Flu Trends, which the company had believed capable of tracking flu outbreaks and severity based on regional searches. “Forecasting an annual event on the basis of 1 year of data is effectively using only a single data point and thus runs into fundamental time-series problems,” they write.

Since the future won’t always look like the past, they argue that there are “diminishing returns” to aggregating massive piles of data. Clinical data has a half-life, after all, which they place at only about 4 months. Rather than taking aim at an unhittable moving target, perhaps science is better served by simply closing the distance in the meantime.
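The cited 4-month half-life implies that a data point's relevance decays exponentially with age, just as radioactive material does. A minimal sketch of that arithmetic (the decay function and the sample ages are illustrative assumptions, not from the Perspective):

```python
HALF_LIFE_MONTHS = 4  # half-life figure cited by Chen and Asch

def relevance(age_months: float) -> float:
    """Fraction of original relevance a clinical data point retains
    after age_months, assuming simple exponential decay."""
    return 0.5 ** (age_months / HALF_LIFE_MONTHS)

for age in (0, 4, 12, 24):
    print(f"{age:>2} months old: {relevance(age):.1%} relevance")
# A year-old data point would retain only 12.5% of its relevance;
# at two years, under 2% — one intuition behind "diminishing returns".
```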

Asch and Chen encourage more direct, yes-or-no questions for data algorithms rather than expecting precise predictions. By asking not “when will you have a heart attack?” but rather “will you have a heart attack in the next 10 years?”, they argue, “predictive algorithms can operate as diagnostic screening tests to stratify patient populations by risk and inform discrete decision making.”
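The reframing above turns an open-ended forecast into a screening test: a model's 10-year event probability is binned into discrete risk tiers that map to decisions. A hedged sketch of that idea (the function name, thresholds, and tiers are invented for illustration; real thresholds come from clinical guidelines):

```python
def stratify(ten_year_risk: float, threshold: float = 0.075) -> str:
    """Map a model's estimated 10-year event probability to a
    discrete risk tier, as a screening test would.
    Threshold values here are hypothetical."""
    if ten_year_risk >= 2 * threshold:
        return "high"          # e.g. discuss intensive prevention
    if ten_year_risk >= threshold:
        return "intermediate"  # e.g. further testing, shared decision
    return "low"               # e.g. routine follow-up

print(stratify(0.03))  # low
print(stratify(0.09))  # intermediate
print(stratify(0.20))  # high
```

The point of the binary/tiered framing is that each bucket corresponds to a concrete action, which is what the authors mean by informing "discrete decision making."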

Their ultimate aim, they indicate, is to encourage the use of predictive analytics to improve care in the ways it already can, without letting the hype get too far ahead of itself (as it almost always does). They place the current hype “at the peak of inflated expectations” and wish to avoid a dive into the “trough of disillusionment” by acknowledging limitations alongside capabilities.

“Let our benchmark be the real-world standards of care whereby doctors grossly misestimate the positive predictive value of screening for rare diagnoses, routinely overestimate patient life expectancy by a factor of 3, and deliver care of widely varied intensity in the last 6 months of life,” they conclude.