
Tackling the Misdiagnosis Epidemic with Human and Artificial Intelligence
Two health-tech leaders examine potentially groundbreaking human and computing efforts to stop diagnostic errors.
The numbers are sobering:
- 5% of U.S. adult outpatients experience a diagnostic error each year.
- Diagnostic mistakes contribute to one in 10 deaths.
- About 12 million adults suffer a diagnostic mishap annually.
These errors fall into four broad categories: Missed diagnosis, misdiagnosis (coming to the wrong conclusion on what causes a patient’s symptoms), delayed diagnosis and overdiagnosis. The last problem has become more prevalent in developed countries because they have the ability to perform an endless array of tests that sometimes detect innocuous variants that don’t signal the presence of disease.
Although these errors cause a great deal of human suffering, reducing their toll is more complicated than it might appear. The first issue that needs to be tackled is figuring out how to measure the problem. There is no universally accepted metric for determining the scope of the diagnostic error epidemic. Some researchers have used medical record reviews, while others rely on malpractice claims data, health insurance data, physician surveys or patient questionnaires. The flaw with all of these yardsticks is that they take a great deal of time to collect, and since time is money, healthcare providers are looking for a more cost-effective way to measure the incidence of diagnostic errors. Without such a cost-effective measure, it is almost impossible to gauge the effectiveness of potential solutions.
Better Ways to Measure Diagnostic Errors
One approach that is garnering attention is called SPADE, which attempts to measure diagnostic mistakes by coupling individual symptoms with specific diseases. SPADE stands for Symptom-Disease Pair Analysis of Diagnostic Error.
One such symptom-disease pair that has been linked to misdiagnosis is dizziness and stroke. A patient may come into the emergency department complaining of dizziness, which the physician diagnoses as otitis media, only to have the patient return to the hospital a few days later with a full-blown cerebrovascular accident. Because researchers have established a clear link between the symptom and the disease, hospitals can track the number of times the coupling occurs to estimate how often their practitioners make the mistake. Then hospitals can take the necessary educational and administrative steps to help correct the problem.
Other pairs that are used in this way are headache and aneurysm, chest pain and myocardial infarction, and fainting and pulmonary embolism.
For the SPADE model to work, providers need a large data set of patient information that includes the symptom and disease occurrences. Healthcare organizations also need to capture these data points regardless of where the patients show up after the more serious event happens. If a significant number of patients with the initial benign diagnosis return to a different health system when they experience the more serious outcome disease, that would skew the results. With that in mind, SPADE is most likely to work when a provider has a very wide reach — for example, when it offers clinical care and insures its patient population. That ensures that the organization will have the follow-up administrative data needed to link symptoms and the misdiagnosed disorders that follow. The SPADE metric would also work if patient data were pulled from a regional health information exchange.
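To make the idea concrete, here is a minimal sketch of the kind of query a health system might run against its encounter data. The diagnosis labels, the 30-day look-back window and the record layout are illustrative assumptions, not the published SPADE methodology.

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=30)  # assumed window between the two visits

def count_pair_events(encounters, benign_dx="dizziness", outcome_dx="stroke"):
    """Count patients sent home with the benign diagnosis who returned
    with the serious outcome disease within the look-back window."""
    by_patient = {}
    for patient_id, visit_date, dx in encounters:
        by_patient.setdefault(patient_id, []).append((visit_date, dx))

    hits = 0
    for visits in by_patient.values():
        visits.sort()  # chronological order
        for i, (d1, dx1) in enumerate(visits):
            if dx1 != benign_dx:
                continue
            if any(dx2 == outcome_dx and timedelta(0) < d2 - d1 <= LOOKBACK
                   for d2, dx2 in visits[i + 1:]):
                hits += 1
                break  # count each patient once
    return hits

# Example: one patient seen for dizziness, back four days later with a stroke.
events = [("p1", date(2024, 3, 1), "dizziness"),
          ("p1", date(2024, 3, 5), "stroke")]
print(count_pair_events(events))  # 1
```

A real implementation would substitute coded diagnoses (ICD-10, for instance) pulled from an EHR, claims feed or health information exchange for the plain-string stand-ins above.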
Of course, measuring the incidence of diagnostic errors is only the beginning. The next step is understanding their many causes. One National Academy of Medicine report, Improving Diagnosis in Health Care, identified a long list of root causes, including:
- Clinicians who ignore patient input regarding signs and symptoms
- Inadequate knowledge base among clinicians
- Incorrect interpretation of medical information (e.g., cognitive errors and biases)
- Failure to integrate collected medical information into a plausible diagnostic hypothesis (also caused by cognitive errors and biases)
- Not properly communicating the diagnosis to patients
- Lab test errors
- Communication problems between testing facilities and clinicians
- Poorly designed clinical documentation systems, including EHRs
- Inadequate interoperability between providers
- Failure to integrate the diagnostic process into clinicians’ normal workflow
- Poor handoff procedures
- Inadequate teamwork among healthcare professionals
While the list may seem overwhelming to anyone trying to reduce the heavy toll taken by diagnostic errors, most of these causes fall into two broad categories: Systemwide issues and cognitive issues. Among the systemwide issues that urgently need attention is the poor communication that often exists between providers and patients. Addressing the problem doesn’t necessarily require the latest AI tools, but it does call for more human intelligence, and more compassion. It might at first seem counterintuitive to suggest that patients can play a role in the diagnostic process; many clinicians believe patients should play a silent role. But clinicians can learn a great deal from patients, if they are willing to listen and not constantly interrupt them before they tell their entire story.
Poor communication between testing facilities and clinicians, another barrier to diagnostic accuracy, can be remedied with relatively low-tech solutions. Setting up an electronic system that verifies receipt of important lab results doesn’t require the latest machine-learning algorithms. Similarly, administrative staff and allied health professionals can monitor the back and forth between testing labs and physicians.
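A minimal sketch of what such a verification system might look like appears below. The record fields, the order IDs and the 24-hour escalation window are illustrative assumptions; a production system would hook into the lab interface and the clinician's EHR inbox.

```python
from datetime import datetime, timedelta

ESCALATION_WINDOW = timedelta(hours=24)  # assumed follow-up deadline

def overdue_critical_results(results, now):
    """Return critical results no clinician has acknowledged within the
    escalation window, so staff can follow up by phone or page."""
    return [r for r in results
            if r["critical"]
            and r["acknowledged_at"] is None
            and now - r["reported_at"] > ESCALATION_WINDOW]

# Example: a critical result reported 25 hours ago, still unacknowledged.
pending = [{"order_id": "LAB-1042", "critical": True,
            "reported_at": datetime(2024, 3, 1, 8, 0),
            "acknowledged_at": None}]
print(overdue_critical_results(pending, now=datetime(2024, 3, 2, 9, 0)))
```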
Addressing Clinicians’ Diagnostic Thinking Errors
The second broad category responsible for diagnostic mistakes, cognitive biases and errors, can distort an individual clinician’s reasoning process and lead them down the wrong path. The list of potential problems is long and includes anchoring, affective bias, availability bias and premature closure. With anchoring, a diagnostician fixates on initial findings and stays anchored to that line of reasoning even when contrary evidence suggests it’s best to change direction. The culture of modern medicine gravitates toward this mindset because it encourages physicians to be overconfident in their own skill sets, and because many physicians, like other leaders, believe the appearance of certainty is the best course of action. Clinicians who are swayed by their positive or negative emotional reactions to patients, on the other hand, are guilty of affective bias. Availability bias is common among clinicians who see the same disorder over and over within a short time frame or who have done research on a specific disorder, while premature closure occurs when a practitioner is too quick to accept the first plausible explanation for all the presenting signs and symptoms.
One way to avoid these cognitive errors is for clinicians to take a more introspective approach to the diagnostic process, sometimes referred to as metacognition, or “thinking about thinking.” It involves forcing oneself to step back and take a dispassionate look at the steps one takes during the reasoning process. Psychologists who have studied diagnosticians’ thinking patterns believe that most physicians rely on one of two approaches: Type 1, which is intuitive, automatic and stereotypic thinking, and Type 2, which is slow, logical, effortful, calculating reasoning. The dual system came to the public’s attention when Daniel Kahneman, who won the Nobel Prize in economics for his seminal work on the topic, published Thinking, Fast and Slow.
What Role Should AI Play?
Of course, even the most gifted diagnostician will make mistakes or become overwhelmed with the sheer volume of data in their workflows.
But AI and machine learning are successfully addressing this issue.
One example: investigators have developed a machine-learning classifier to identify patients with familial hypercholesterolemia (FH), a genetic condition that frequently goes undiagnosed.
At the highest probability threshold, the classifier detected 84% of the FH patients who had been cared for at Stanford Health Care. The investigators used structured and unstructured patient data from Stanford’s EHR system to create their classification system and verified its accuracy by applying it to a different patient population, using it to flag FH patients in Geisinger Health System. That external check is an important distinction to point out, as many AI projects have fallen short because they have been shown to be reliable only in a narrowly defined patient population.
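The pattern is easy to express in code. Below is a minimal sketch of train-on-one-site, test-on-another validation using scikit-learn; the feature matrices, the model choice and the probability threshold are all placeholders rather than the Stanford team's actual pipeline.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score, roc_auc_score

def externally_validate(X_dev, y_dev, X_ext, y_ext, threshold=0.5):
    """Fit on the development site, then evaluate only on an independent
    site; performance that survives the site change is the evidence that
    a classifier generalizes beyond its training population."""
    model = RandomForestClassifier(n_estimators=300, random_state=0)
    model.fit(X_dev, y_dev)

    probs = model.predict_proba(X_ext)[:, 1]  # probability of the positive class
    flagged = probs >= threshold
    return {
        "recall": recall_score(y_ext, flagged),  # share of true cases caught
        "auroc": roc_auc_score(y_ext, probs),    # threshold-free discrimination
    }
```

The random forest and the 0.5 threshold are stand-ins; the point is that the metrics are computed exclusively on data the model never saw during development.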
A consequence of diagnostic errors is that they often land patients back in the hospital after they have been discharged. The SPADE tool we discussed earlier is designed to help reduce that eventuality. Other researchers have studied the best way to prevent avoidable readmissions by developing more accurate predictions of who is most likely to be readmitted. Several traditional risk scores exist to help make this prediction, including the LACE, HOSPITAL and Maxim/Right Care scores, none of which is especially reliable. Researchers from the University of Maryland have developed a risk score derived from EHR data, using machine learning that relies on convolutional neural networks and gradient boosting regression. The new algorithm, called the Baltimore or B score, was compared to the more traditional risk scores in three different hospitals. Each hospital was evaluated with a different version of the B score because the tool was built with patient data from each institution.
To compare the risk scores, the researchers measured each one’s area under the receiver operating characteristic curve (AUROC), a standard gauge of how well a model separates patients who will be readmitted from those who will not.
Morgan and associates found that the B score was significantly better than all the traditional scoring systems. For example, at 48 hours after admission, the AUROC for the B score was 0.72, versus 0.63, 0.64 and 0.66 for the HOSPITAL, Maxim/Right Care and modified LACE scores, respectively, at one hospital. Put another way, “the B score was able to identify the same number of readmitted patients while flagging 25.5% to 54.9% fewer patients.”
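That last comparison, catching the same readmissions while flagging fewer patients, can be reproduced with a simple threshold search. The sketch below assumes NumPy arrays of outcomes and risk scores; the function and its inputs are illustrative, not the Maryland team's code.

```python
import numpy as np

def flags_at_matched_recall(y_true, old_score, new_score, old_threshold):
    """Return (new_flags, old_flags): how many patients each score must
    flag for the new score to catch at least as many true readmissions
    as the old score does at its chosen threshold."""
    old_flagged = old_score >= old_threshold
    target = int((old_flagged & (y_true == 1)).sum())

    # Sweep the new score's thresholds from strictest to loosest and stop
    # at the first one that matches the old score's catch count.
    for t in np.sort(np.unique(new_score))[::-1]:
        new_flagged = new_score >= t
        if int((new_flagged & (y_true == 1)).sum()) >= target:
            return int(new_flagged.sum()), int(old_flagged.sum())
    return len(y_true), int(old_flagged.sum())
```

If the first number comes back 25% to 55% smaller than the second, the new score is doing the same clinical work while generating far fewer false alarms, which is the pattern the B score showed.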
And that would save a hospital the resources and staff needed to deliver specialized follow-up care to patients who were unlikely to be readmitted and probably didn’t need it.
About the Authors
Paul Cerrato has more than 30 years of experience working in healthcare as a clinician, educator, and medical editor. He has written extensively on clinical medicine, electronic health records, protected health information security, practice management, and clinical decision support. He has served as editor of Information Week Healthcare, executive editor of Contemporary OB/GYN, senior editor of RN Magazine, and contributing writer/editor for the Yale University School of Medicine, the American Academy of Pediatrics, Information Week, Medscape, Healthcare Finance News, IMedicalapps.com, and Medpage Today. HIMSS has listed Mr. Cerrato as one of the most influential columnists in healthcare IT.
John D. Halamka, M.D., leads innovation for Beth Israel Lahey Health. Previously, he served for over 20 years as the chief information officer (CIO) at the Beth Israel Deaconess Healthcare System. He is chairman of the New England Healthcare Exchange Network (NEHEN) and a practicing emergency physician. He is also the International Healthcare Innovation professor at Harvard Medical School. As a Harvard professor, he has served the George W. Bush administration, the Obama administration and national governments throughout the world, planning their healthcare IT strategies. In his role at BIDMC, Dr. Halamka was responsible for all clinical, financial, administrative and academic information technology, serving 3,000 doctors, 12,000 employees, and 1 million patients.