Machine-Learning Model Helps Prioritize Evaluation of Patients With Undiagnosed Diseases

Samara Rosenfeld

The machine-learning model could save time and healthcare costs.

A new study suggests that using machine learning to prioritize the evaluation of patients with undiagnosed diseases is feasible and may increase the number of applications processed in a given period.

Isaac Kohane, M.D., Ph.D., and a team of investigators hope the findings prompt discussion about automating referral decisions in the broader context of healthcare, which could save time and money for hospitals and health systems.

Kohane and his colleague, Hadi Amiri, Ph.D., developed computational models to effectively predict admission outcomes for applicants seeking Undiagnosed Diseases Network (UDN) evaluation. Further, they ranked the applications based on the likelihood of patient admission to the UDN.

To be included, an applicant had to have a condition that remained undiagnosed despite evaluation by a healthcare professional and at least one objective finding pertaining to the phenotype for which the application was submitted. Each application in the data set included an application form containing demographic information; an official referral letter signed by a provider summarizing the applicant’s medical problems, previous diagnoses, treatments, and medications; the application submission date; the application review date; and the outcome of the application.

The investigators split the data set into training (80%), validation (10%), and test (10%) sets. They then developed a logistic regression classifier and used a grid search to optimize its hyperparameters on the validation data. The classifier was trained using features obtained or extracted from application materials, including “baseline” features from patient demographic information (normalized age at time of the application, age at disease onset, disease duration, and number of prior UDN visits).
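For readers who want to see the shape of that workflow, here is a minimal sketch of an 80/10/10 split followed by a grid search over a logistic regression. It is not the investigators' code. The data are synthetic, the two features only stand in for the "baseline" features, and the L2 penalty grid, learning rate, and function names are all assumptions made for illustration, since the article does not say which hyperparameters were tuned.

```python
import math
import random


def split_dataset(rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle rows and cut them into train/validation/test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def predict_proba(weights, features):
    """Logistic model: sigmoid of the bias plus the weighted feature sum."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, l2, lr=0.5, epochs=200):
    """Batch gradient descent on the L2-penalized log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in rows:
            err = predict_proba(weights, features) - label
            grad[0] += err
            for j, x in enumerate(features):
                grad[j + 1] += err * x
        for j in range(len(weights)):
            penalty = l2 * weights[j] if j > 0 else 0.0  # bias is not penalized
            weights[j] -= lr * (grad[j] / len(rows) + penalty)
    return weights


def accuracy(weights, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum((predict_proba(weights, f) >= 0.5) == bool(y) for f, y in rows)
    return hits / len(rows)


def grid_search(train, val, l2_grid):
    """Fit one model per grid value; keep the best on validation accuracy."""
    best = None
    for l2 in l2_grid:
        weights = fit_logistic(train, l2)
        score = accuracy(weights, val)
        if best is None or score > best[2]:
            best = (l2, weights, score)
    return best


# Synthetic stand-in data, NOT the UDN data: two features scaled to [0, 1].
rng = random.Random(42)
rows = []
for _ in range(300):
    age, onset = rng.random(), rng.random()
    label = 1 if age + onset + rng.gauss(0, 0.2) < 1.0 else 0
    rows.append(([age, onset], label))

l2_grid = [0.0, 0.01, 0.1, 1.0]  # hypothetical hyperparameter grid
train, val, test = split_dataset(rows)
best_l2, best_weights, val_acc = grid_search(train, val, l2_grid)
test_acc = accuracy(best_weights, test)
print(len(train), len(val), len(test), best_l2, round(val_acc, 3), round(test_acc, 3))
```

The test set is touched only once, after the grid search has picked a penalty on the validation set, which is the point of keeping the three sets separate.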

Kohane and Amiri used the confidence score of the logistic classifier to rank applications based on their likelihood of acceptance. To get the confidence score, the team conducted k-fold cross-validation experiments with the UDN data set and stored confidence scores on test instances for final ranking. They used the UDN’s review session frequency and the number of applications that could be reviewed in each session to estimate the mean application processing time from the ranked list of applications. The team considered four approaches to generating ranked lists: the current processing order at the UDN, a first-in-first-out queue, classifier ranking, and accepted-first ranking.
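The effect of ranking on processing time can be illustrated with a toy calculation. The sketch below is a simplification, not the study's method: it assumes every application is already in the queue, that a fixed number are reviewed per session, and that sessions occur at a fixed monthly rate. The six applications, their scores, and the session parameters are hypothetical.

```python
def mean_wait_for_accepted(ranked, per_session, sessions_per_month):
    """Mean months until review for accepted applications.

    ranked: list of (app_id, score, accepted) tuples in review order.
    The application at position i is reviewed in session i // per_session + 1.
    """
    waits = []
    for position, (_, _, accepted) in enumerate(ranked):
        session_number = position // per_session + 1
        if accepted:
            waits.append(session_number / sessions_per_month)
    return sum(waits) / len(waits)


# Hypothetical applications in arrival order: (id, classifier score, accepted?)
applications = [
    ("A", 0.20, False),
    ("B", 0.90, True),
    ("C", 0.40, False),
    ("D", 0.75, True),
    ("E", 0.10, False),
    ("F", 0.60, True),
]

# First-in-first-out keeps arrival order.
fifo = list(applications)
# Classifier ranking sorts by confidence score, highest first.
by_classifier = sorted(applications, key=lambda app: app[1], reverse=True)
# Accepted-first is an oracle ordering that needs the true outcome.
accepted_first = sorted(applications, key=lambda app: app[2], reverse=True)

PER_SESSION = 2          # assumed applications reviewed per session
SESSIONS_PER_MONTH = 2   # assumed review sessions per month

fifo_wait = mean_wait_for_accepted(fifo, PER_SESSION, SESSIONS_PER_MONTH)
classifier_wait = mean_wait_for_accepted(by_classifier, PER_SESSION, SESSIONS_PER_MONTH)
oracle_wait = mean_wait_for_accepted(accepted_first, PER_SESSION, SESSIONS_PER_MONTH)
print(fifo_wait, classifier_wait, oracle_wait)
```

Accepted-first ranking cannot be used in practice because it requires knowing the outcome in advance; it serves as a lower bound against which the classifier ranking can be compared.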

Overall, the accepted cohort was significantly younger (mean age at application, 19.7 years old vs 36.5 years old) and had earlier onset of disease (mean age at symptom onset, 10.8 years old vs 28.4 years old) than those who were not accepted. Disease duration and the number of prior UDN site visits were similar in both groups. Application processing time was significantly longer for applications that were not accepted (4.73 months vs 3.29 months for accepted applications). A large number of applications in both groups were for neurologic (47.6% and 34.1%) and musculoskeletal symptoms (12.5% and 9.6%), while few applications were for gynecologic (.2% and .2%) or toxicologic (0% and .4%) symptoms.

The investigators found the best classifier had a sensitivity of .843, a specificity of .738, and an area under the receiver operating characteristic curve of .844 for predicting admission outcomes among 1,212 accepted and 1,210 not accepted applications. By ordering applications based on their likelihood of acceptance, the classifier was able to decrease the mean UDN processing time for accepted applications from 3.29 months to 1.05 months, a 68% improvement.
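As a quick check on the arithmetic, the 68% figure is the relative reduction from 3.29 to 1.05 months. The sketch below computes it and also shows how sensitivity and specificity are defined. The label and prediction lists are made-up toy values, not the study's results, and the function names are illustrative.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions.

    Sensitivity is the share of truly accepted cases predicted as accepted;
    specificity is the share of truly rejected cases predicted as rejected.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def relative_reduction(before, after):
    """Fractional decrease from before to after."""
    return (before - after) / before


# Toy values for illustration only (1 = accepted, 0 = not accepted).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_predictions = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(toy_labels, toy_predictions)
print(sens, spec)  # 0.75 0.75

# Processing times reported in the article, in months.
reduction = relative_reduction(3.29, 1.05)
print(f"{reduction:.1%}")  # 68.1%
```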

The system may help clinical evaluators distinguish, prioritize, and accelerate admission to the UDN for patients who have undiagnosed diseases. Accelerating the process could improve the diagnostic journeys of such patients and serve as a model for automating triage or referral in other resource-constrained settings. Further, a shorter turnaround time could reduce the overall cost of diagnosis because the longer it takes to accept patients, the more diagnostic routes they are likely to seek.

The study, “Machine Learning of Patient Characteristics to Predict Admission Outcomes in the Undiagnosed Diseases Network,” was published online in JAMA Network Open.