Machine-Learning Algorithm Identifies Incident Stroke

March 8, 2021

By Samara Rosenfeld

The algorithm can be adopted by other hospitals and health systems to identify incident stroke.

Nicholas Larson, Ph.D.

A machine-learning algorithm performs well for identifying incident stroke and for determining type of stroke.

The algorithm’s performance in a general population sample demonstrated its generalizability and potential to be adopted by other hospitals and health systems.

Nicholas Larson, Ph.D., and colleagues developed a machine learning-based phenotyping algorithm for incident stroke ascertainment based on diagnosis codes, procedure codes, and clinical concepts extracted from clinical notes using natural language processing. The predictive modeling study used observational cohort data for training and validation. An atrial fibrillation (A-fib) cohort was used to train and test the phenotyping algorithm for the date of incident stroke events. The generalizability of the algorithm was evaluated in a general population cohort.

A patient population from Minnesota made up the A-fib cohort. All healthcare-related events were extracted through the Rochester Epidemiology Project. Data included demographic information, diagnostic and procedure codes, healthcare utilization data, outpatient drug prescriptions, results of laboratory tests, and information about smoking, height, weight, and body mass index.

The algorithm aimed to identify first stroke events within a certain time frame. The team used three major data elements: clinical concepts, ICD-9 codes, and CPT codes. They constructed different models by varying whether CPT codes and symptom-related clinical concepts were included in the feature set, then compared the models' performance. Clinical concepts were identified from the major and secondary problem lists in the Mayo Clinic EHR and from clinical notes from other Rochester Epidemiology Project sites using a natural language processing system.
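To make the idea of a "first stroke event within a time frame" concrete, here is a minimal sketch that reduces visit-level predictions to one incident date per patient. The function name, the tuple layout, and the sample visits are all hypothetical; the article does not describe how the study's algorithm performs this step.

```python
from datetime import date

def first_stroke_dates(visits, window_start, window_end):
    """Return {patient_id: earliest predicted-stroke visit date} inside the window.

    `visits` is an iterable of (patient_id, visit_date, predicted_stroke) tuples.
    This layout is an assumption made for illustration only.
    """
    first = {}
    for patient_id, visit_date, predicted_stroke in visits:
        if not predicted_stroke:
            continue  # visit was not classified as a stroke visit
        if not (window_start <= visit_date <= window_end):
            continue  # outside the time frame of interest
        if patient_id not in first or visit_date < first[patient_id]:
            first[patient_id] = visit_date
    return first

# Hypothetical visit-level predictions
visits = [
    ("p1", date(2010, 3, 1), False),
    ("p1", date(2011, 6, 15), True),
    ("p1", date(2012, 1, 9), True),    # later positive visit, not the incident event
    ("p2", date(2011, 2, 2), False),
    ("p3", date(2009, 12, 30), True),  # positive, but before the window opens
]

incident = first_stroke_dates(visits, date(2010, 1, 1), date(2012, 12, 31))
print(incident)
```

In this toy input only patient p1 receives an incident date, and it is the earlier of the two positive visits.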

Larson and the investigators created a data set of 9,130 confirmed visits with stroke and nonstroke labels among 1,773 patients: 746 stroke visits and 8,384 nonstroke visits. Data from a randomly selected 79.98% of screened patients formed the training set, and data from the remaining 20.02% were retained as an independent testing set.
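Because the split is described in terms of patients rather than visits, one plausible reading is that every visit from a given patient lands on the same side of the split. The sketch below illustrates that kind of patient-level split on toy data. The function name, the dictionary fields, the rounded 80% fraction, and the seed are assumptions, not details from the study.

```python
import random

def split_by_patient(visits, train_fraction=0.8, seed=0):
    """Split visit records so that no patient appears in both sets."""
    patient_ids = sorted({v["patient_id"] for v in visits})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(patient_ids)
    n_train = round(len(patient_ids) * train_fraction)
    train_ids = set(patient_ids[:n_train])
    train = [v for v in visits if v["patient_id"] in train_ids]
    test = [v for v in visits if v["patient_id"] not in train_ids]
    return train, test

# Toy data: 10 hypothetical patients with 3 labeled visits each
visits = [
    {"patient_id": f"p{i}", "visit": j, "stroke": (i + j) % 5 == 0}
    for i in range(10)
    for j in range(3)
]

train, test = split_by_patient(visits)
print(len(train), len(test))
```

Splitting by patient instead of by visit keeps repeated visits from the same person out of both sets at once, which would otherwise make test performance look better than it is.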

Phenotype models were trained using logistic regression and random forest. The team evaluated the generalizability of the model on a sample from a general population cohort of more than 71,000 patients. Those included were at least 30 years old with no prior history of cardiovascular disease. The best-performing model was applied to the entire population cohort to generate incident stroke predictions. The team then selected three groups of 50 patients for evaluation: 50 chosen at random from those with no stroke-related features, 50 from those with negative stroke predictions, and 50 from those with positive stroke predictions and a predicted incident stroke.
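The three-group selection described above can be sketched as a stratified sample. This is only an illustration: the field names, the toy cohort, and the choice to sample every stratum uniformly at random are assumptions, since the article states random selection explicitly only for the first group.

```python
import random

def sample_for_review(patients, per_stratum=50, seed=0):
    """Draw up to `per_stratum` patient ids from each prediction stratum."""
    rng = random.Random(seed)
    strata = {
        "no_stroke_features": [],
        "predicted_negative": [],
        "predicted_positive": [],
    }
    for p in patients:
        if not p["has_stroke_features"]:
            strata["no_stroke_features"].append(p["id"])
        elif p["predicted_stroke"]:
            strata["predicted_positive"].append(p["id"])
        else:
            strata["predicted_negative"].append(p["id"])
    # Sample without replacement; small strata are returned in full
    return {
        name: rng.sample(ids, min(per_stratum, len(ids)))
        for name, ids in strata.items()
    }

# Hypothetical cohort of 600 patients, 200 per stratum
patients = [
    {"id": i, "has_stroke_features": i % 3 != 0, "predicted_stroke": i % 3 == 2}
    for i in range(600)
]

review = sample_for_review(patients)
print({name: len(ids) for name, ids in review.items()})
```

Sampling a fixed number per stratum guarantees that rare groups, such as predicted-positive patients, are represented in the manual review even when they are a small share of the cohort.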

Overall, of 4,914 patients with A-fib, 740 had validated incident stroke events. The best performing algorithm used clinical concepts, diagnosis codes, and procedure codes as features in a random forest classifier.

Among those with stroke codes in the general population sample, the best-performing model had a positive predictive value of 86% (95% CI, 74%-93%) and a negative predictive value of 96%. For subtype identification, the team achieved an accuracy of 83% in the A-fib cohort and 80% in the general population sample.
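For readers less familiar with these metrics, the sketch below shows how positive and negative predictive values and a 95% confidence interval can be computed from review counts. The counts used here (43 of 50 and 48 of 50) are hypothetical, chosen only because they reproduce 86% and 96%; the article does not report the underlying counts or the interval method. The Wilson score interval is one common choice, and with these assumed counts it comes out close to the reported range.

```python
import math

def ppv(true_pos, false_pos):
    """Share of positive predictions that were confirmed as true events."""
    return true_pos / (true_pos + false_pos)

def npv(true_neg, false_neg):
    """Share of negative predictions that were confirmed as non-events."""
    return true_neg / (true_neg + false_neg)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical chart-review counts, not taken from the study
tp, fp = 43, 7   # predicted positive: confirmed vs. not confirmed
tn, fn = 48, 2   # predicted negative: confirmed negative vs. missed stroke

ppv_value = ppv(tp, fp)
npv_value = npv(tn, fn)
ppv_low, ppv_high = wilson_interval(tp, tp + fp)

print(f"PPV {ppv_value:.2f} (95% CI {ppv_low:.2f}-{ppv_high:.2f}), NPV {npv_value:.2f}")
```

With only 50 reviewed patients per group, the interval is wide, which is worth keeping in mind when comparing these figures across institutions.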

The findings demonstrated that incorporating structured EHR data can effectively distinguish incident stroke mentions from historical events in the clinical notes. Based on the algorithm's performance in the general population cohort, it could be adopted by other institutions.

The study, “Natural Language Processing and Machine Learning for Identifying Incident Stroke From Electronic Health Records: Algorithm Development and Validation,” was published online in the Journal of Medical Internet Research.

© 2025 MJH Life Sciences

All rights reserved.