Articles | January 15, 2018

What Healthcare Can Learn From a Proud Data Parasite

The creator of a digital genetics database says it’s all about harmonization.

Some might call him a data parasite, but Paul Pavlidis, PhD, doesn’t mind. “It’s a slur that we now embrace,” he tells Healthcare Analytics News™. “It’s a good thing.”

He borrowed the title from a 2016 New England Journal of Medicine op-ed in which its editor-in-chief described the potential for “research parasites” to take advantage of an open data-sharing system, though forms of the label had been around before that article. So, what is a data parasite? “We don’t generate data; we just take it from other people,” says Pavlidis, a psychiatry professor at the University of British Columbia in Vancouver, Canada.

And that is a good thing: It’s researchers like him who scrutinize the work of others, ensuring information reliability and integrity, and examine data sets to identify other uses that the original investigators might have overlooked.

Data parasites may also compile disparate data and build new databases, as Pavlidis and his colleagues did when they created NeuroExpresso, a searchable, open-access, online repository of gene expression profiles for 36 types of brain cells, based on mouse data. Healthcare Analytics News™ caught up with Pavlidis late last year after he published a corresponding paper, “Cross-Laboratory Analysis of Brain Cell Type Transcriptomes with Applications to Interpretation of Bulk Tissue Data,” in the journal eNeuro.

As the conversation progressed, it kept returning to 1 theme: What would a data parasite recommend to a given institution, researcher, or healthcare organization? Healthcare stakeholders of all backgrounds are grappling with ever-growing piles of data from electronic medical records, wearables, genes, lab tests, research, biospecimens, and more. So what can they learn from someone who is just far enough removed from the data-gathering work to see its strengths and flaws?

Go Public

When researchers and other healthcare institutions place de-identified information in public databases, it benefits future studies, Pavlidis says. NeuroExpresso drew much of its data from the Gene Expression Omnibus run by the National Center for Biotechnology Information. This is great for investigators who are generating data and want its value to be “greater than what they write in their own publication.”

Another plus is that public databases do a certain amount of work to harmonize the data. That means that the gene expression profiles might match up, and particular genes may be comparable.

Still, it does not always work out so well. Pavlidis and his colleagues often must harmonize data that live in public registries. “The numbers might not be on the same scale. We have to normalize it,” he says. “That’s somewhat inherently an imperfect process.”
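To make the scale problem concrete, here is a minimal sketch in Python of one common way to put two data sets on a shared scale: converting each to z-scores. It is an illustration only, not necessarily the method behind NeuroExpresso. The gene names, the values, and the `zscore` and `normalize` helpers are all hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is nothing to scale.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def normalize(dataset):
    """Return a copy of a {gene: value} mapping with z-scored values."""
    genes = list(dataset)
    scaled = zscore([dataset[g] for g in genes])
    return dict(zip(genes, scaled))


# Hypothetical expression values for the same four genes from two labs.
# Lab B reports on a scale 100 times larger than Lab A.
lab_a = {"GeneA": 2.0, "GeneB": 4.0, "GeneC": 6.0, "GeneD": 8.0}
lab_b = {"GeneA": 200.0, "GeneB": 400.0, "GeneC": 600.0, "GeneD": 800.0}

norm_a = normalize(lab_a)
norm_b = normalize(lab_b)

for gene in lab_a:
    print(gene, round(norm_a[gene], 3), round(norm_b[gene], 3))
```

After rescaling, the two labs' values line up gene for gene. Real data sets rarely differ by a clean constant factor, which is one reason the process is, as Pavlidis puts it, inherently imperfect.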

Keep It Simple

Quality control is the first step toward keeping data simple. Too often, data parasites encounter samples that are flawed in some way. For example, a cell type might be contaminated with another cell type, Pavlidis says. It is crucial that data generators strive to keep the data simple, meaning that each sample is actually what it is supposed to be.

The same goes for digital data interfaces like NeuroExpresso. Data generators and parasites alike should build software that is simple, easy to use, and intuitive. The shiniest bell and whistle should be the inclusion of an access point to the underlying data. “That’s what I think is going to be the big win here,” Pavlidis says of NeuroExpresso’s use of that feature.

Embrace the Data Parasites

Pavlidis and his ilk are trying to improve the data situation, whether that be in terms of quality or access. He wants to be open about what exactly he is doing, providing tools and resources along the way. And he’s happy to teach data generators about his work—and how they can help improve it. “It’s something a lot of scientists are realizing adds value to their work,” he says. “We’re hoping it’s a positive thing.”
