February 27, 2018

How NLP and Genomics Can Scrape Psychiatric Insights Out of Unstructured EHR Data

A team of researchers from Harvard and Brigham and Women's says that their new methodology will be made freely available to other researchers.

A common knock on electronic health records (EHRs) is that they can be difficult to mine for meaningful insights. That might have less to do with the information they contain, however, than the ways they are traditionally processed. For complex conditions like mental disorders, the problems are particularly pronounced.

A team of researchers in Boston, however, is exploring how natural language processing (NLP) and genomics can be combined to develop a solution, and according to the lead author, the software they have developed will be made freely available to other researchers.

“Many efforts to use clinical documentation in electronic health records for research aim to identify individual symptoms, like the presence or absence of psychosis,” Thomas McCoy Jr., MD, of Massachusetts General Hospital and Harvard Medical School, said. “My co-authors and I developed a method that instead captures symptom dimensions, or sets of symptoms.”

The team based their categories on National Institute of Mental Health Research Domain Criteria standards, and today they published 2 new studies. The first used NLP to specifically extract symptom information from the unstructured data buried within EHRs of over 3,600 adults with psychiatric hospitalizations for a range of conditions, schizophrenia, major depressive disorder, and post-traumatic stress disorder among them.

The researchers developed a list of “seed words” that appeared in between 10% and 90% of the EHRs, and for each of those terms they generated 50 similar unigrams and bigrams (one- and two-word phrases). Unrelated or ancillary terms were then pruned, allowing the team to characterize condition severity across the cohort based on the appearance of phrases.
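To make the seed-word step concrete, here is a minimal Python sketch of the general idea, not the researchers' actual software. It keeps candidate terms that appear in between 10% and 90% of notes, then scores a note by counting how often each domain's unigrams and bigrams appear. The function names, the sample notes, and the domain labels are all hypothetical, and the step that expands each seed word into 50 similar terms is not shown.

```python
import re


def ngrams(text):
    """Return the lowercase unigrams and bigrams found in a note."""
    tokens = re.findall(r"[a-z']+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def document_frequency(notes, terms):
    """Fraction of notes in which each term appears at least once."""
    note_terms = [set(ngrams(note)) for note in notes]
    return {
        term: sum(term in seen for seen in note_terms) / len(notes)
        for term in terms
    }


def select_seed_terms(notes, candidates, low=0.10, high=0.90):
    """Keep candidates whose document frequency lies within [low, high]."""
    freq = document_frequency(notes, candidates)
    return [term for term in candidates if low <= freq[term] <= high]


def score_note(note, domain_terms):
    """Count occurrences of each domain's terms in a single note."""
    found = ngrams(note)
    return {
        domain: sum(1 for gram in found if gram in terms)
        for domain, terms in domain_terms.items()
    }


# Hypothetical notes and candidate terms, for illustration only.
notes = [
    "Patient reports low mood and poor sleep.",
    "Patient denies low mood.",
    "Patient is alert.",
    "Poor sleep noted; patient anxious.",
]
candidates = ["patient", "low mood", "poor sleep", "psychosis"]
seeds = select_seed_terms(notes, candidates)

# Hypothetical assignment of terms to symptom domains.
domains = {"negative_valence": {"low mood"}, "arousal": {"poor sleep"}}
scores = score_note("Low mood today, low mood yesterday, poor sleep.", domains)
```

In this toy example, “patient” is dropped because it appears in every note and “psychosis” is dropped because it appears in none, which is the point of the 10%-to-90% filter: terms that are nearly universal or nearly absent carry little information for distinguishing patients.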

Traditionally, when health systems are looking to use EHRs to predict condition severity and related metrics (like length of hospital stay), they pair the data with billing information. By focusing instead on the language within the detailed physician notes in the EHRs, the researchers developed a system that could predictively correlate symptoms with length of stay and cognitive performance scores, as validated by adjusted Cox regression models.

That study, the team concluded, “shows that natural language processing can be used to efficiently and transparently score clinical notes in terms of cognitive and psychopathologic domains.”

A second study by the same set of authors tried to further that effort by applying genomics.

"The recognition that the genetic basis of psychiatric illness crosses traditional boundaries has encouraged efforts to understand psychopathology according to dimensions, rather than simply presence or absence of symptoms," McCoy said.

The group drew from the Partners Biobank program, a sequencing collaboration between Brigham and Women’s Hospital and Massachusetts General Hospital, and applied the NLP methodology developed in the earlier study to extract symptom dimensions from the population. They outlined loci based on the EHR symptom sets and went to work checking for them in the genomic records.

Four of the loci exceeded a genome-wide threshold for statistical significance. “Two of these span genes are associated with neurodevelopment (RFPL3) or neurodegeneration (PFR3),” the authors wrote. “While both are known to be brain expressed, neither has previously been strongly associated with neuropsychiatric disease, suggesting the potential utility of the approach we describe in understanding brain function in a manner that is unbiased by traditional nosology.”
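For readers unfamiliar with the term, a genome-wide significance check can be sketched in a few lines of Python. The article does not state the threshold the authors used; the value below, 5e-8, is the conventional cutoff in genome-wide association studies and is an assumption here. The locus names and p-values are hypothetical placeholders, not results from the study.

```python
# Conventional genome-wide significance cutoff; an assumption, since the
# article does not give the exact value the authors applied.
GENOME_WIDE_THRESHOLD = 5e-8


def significant_loci(results, threshold=GENOME_WIDE_THRESHOLD):
    """Return the names of loci whose p-value falls below the threshold."""
    return [row["locus"] for row in results if row["p_value"] < threshold]


# Hypothetical association results, for illustration only.
results = [
    {"locus": "locus_a", "p_value": 3.2e-9},
    {"locus": "locus_b", "p_value": 4.0e-6},
    {"locus": "locus_c", "p_value": 1.1e-8},
]
hits = significant_loci(results)
```

The cutoff is far stricter than the familiar 0.05 because a genome-wide scan tests a very large number of variants at once, so a lenient threshold would flag many associations by chance alone.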

Both studies were published today in Biological Psychiatry. “The ability to combine large DNA data sets with meaningful psychiatric information from the electronic health record is an important step in facilitating large scale medical genetics research in psychiatry,” the editor of the journal, John Krystal, MD, said in a statement.

Citing the decision to make the software available to other researchers, McCoy said that he and his team “hope this work will enable transdiagnostic dimensional phenotypes to be used in efforts to achieve precision psychiatry.”
