After re-analyzing raw gene expression data, a group of scientists at Cold Spring Harbor Laboratory in New York has crafted a ranked list of more than 19,000 genes that act unusually during disease onset, a resource that could aid researchers and clinicians.
The meta-analysis set out to quantify the degree to which the same genes show up across studies, Jesse Gillis, Ph.D., associate professor of computational genomics at Cold Spring Harbor Laboratory, told Inside Digital Health™.
Among the key findings: ranking genes by their contribution to differential expression allowed the team to predict hit lists with high accuracy, with a mean area under the receiver operating characteristic curve of 0.8.
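To make the accuracy figure concrete, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen gene on a hit list is ranked above a randomly chosen gene that is not. The sketch below is only an illustration of that definition, not the authors' code; the gene names, prior scores and hit list are invented for the example.

```python
def auroc(scores, hits):
    """AUROC of a gene ranking against one hit list.

    scores: dict mapping gene name -> prior score (higher = ranked higher).
    hits:   set of gene names that appear on the hit list.
    Counts, over all (hit, non-hit) pairs, how often the hit scores higher;
    ties count as half a win.
    """
    pos = [s for g, s in scores.items() if g in hits]
    neg = [s for g, s in scores.items() if g not in hits]
    if not pos or not neg:
        raise ValueError("need at least one hit and one non-hit gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prior scores and hit list, for illustration only.
prior = {"GENE_A": 0.20, "GENE_B": 0.12, "GENE_C": 0.05, "GENE_D": 0.01}
hit_list = {"GENE_A", "GENE_C"}
score = auroc(prior, hit_list)
print(score)
```

A value of 0.5 would mean the ranking is no better than chance at anticipating a hit list, and 1.0 would mean every hit outranks every non-hit.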
The team, along with Paul Pavlidis at the University of British Columbia, conducted a computational analysis of 635 data sets covering about 27,000 samples, comparing gene activity in samples from people with diseases against samples from people without them.
The authors offered the analogy of a psychic. When a psychic doing a cold reading asks whether anyone in the audience is named David, it is a very probable guess.
People become excited when David gets singled out, not thinking about how popular the name is. Gillis and postdoctoral researcher Maggie Crow noticed that thinking this way could be problematic for studies that compare the gene activity of healthy cells to that of cells involved in disease.
In the same way, the team found genes that, like the name David, are so likely to be affected by any disease that their appearance is unsurprising. Flagging them helped the team identify genes that are more specific to particular conditions.
Nearly all genes are differentially expressed at least once, with most genes recurring in about 10 percent of datasets. There was also evidence of common differential expression, with 229 genes occurring in more than 10 percent of differential expression hit lists.
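The recurrence figures above come down to counting how often each gene appears across hit lists. The sketch below shows that arithmetic under simplifying assumptions; it is not the authors' pipeline, and the gene names, hit lists and helper name `recurrence_frequency` are hypothetical.

```python
from collections import Counter


def recurrence_frequency(hit_lists):
    """Fraction of hit lists in which each gene appears at least once."""
    if not hit_lists:
        raise ValueError("need at least one hit list")
    # set() ensures a gene is counted once per hit list, even if repeated.
    counts = Counter(gene for hits in hit_lists for gene in set(hits))
    total = len(hit_lists)
    return {gene: count / total for gene, count in counts.items()}


# Hypothetical hit lists, for illustration only.
hit_lists = [
    ["GENE_A", "GENE_B"],
    ["GENE_A"],
    ["GENE_A", "GENE_C"],
    ["GENE_B"],
    ["GENE_D"],
]
freq = recurrence_frequency(hit_lists)

# Rank genes from most to least recurrent, breaking ties by name.
ranked = sorted(freq, key=lambda gene: (-freq[gene], gene))

# Genes that recur in more than a chosen share of hit lists.
threshold = 0.5
common = [gene for gene in ranked if freq[gene] > threshold]
print(ranked, common)
```

Genes near the top of such a ranking are the "Davids" of the analogy: they turn up so often that seeing them in any one study says little about that particular disease.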
CXCL8, a chemokine involved in attracting neutrophils toward the site of injury or infection, was the most extreme example, occurring in nearly 20 percent of all datasets.
“Our results have major implications for interpreting (differential expression) hit lists from meta-analysis, which are very likely to report generic (high-differential expression prior) genes, as well as for other well-characterized gene sets, such as disease biomarkers, or housekeeping genes,” the authors wrote.
Being able to quantify genes precisely is extremely valuable, according to Gillis.
With the list, clinicians and researchers have a tool that helps explain which genes showed up in a study and why. If a gene ranks high on the differential expression prior, it is associated with the disease, but a causal relationship is hard to establish, as is the case with inflammatory responses, Gillis told us.
There is also a chance that new data will aid researchers in designing better experiments, discovering new drug targets and developing treatments for a vast range of diseases.
XIST is the gene that showed up the most, with DDX3Y appearing second most.