Sharing algorithms can benefit healthcare research and society at large.
(John McDonald, PhD, photo courtesy of Georgia Tech Research Horizons)
“Despite the documented advantages of the open sharing of code, to date, the practice has been extremely limited within the field of cancer drug prediction,” a group of Georgia Tech researchers write in a new PLOS ONE article. But they’re trying to change that.
Calling its project “a gauntlet,” the team wants other researchers to take, use, modify, and expand upon a machine learning platform it built to judge cancer treatment effectiveness using genetic data.
“We feed in genomic data. We use RNA expression data. Basically, we’re just looking for correlations,” John McDonald, PhD, told Healthcare Analytics News™. McDonald is the director of Georgia Tech’s Integrated Cancer Research Center and one of the study’s lead authors. “We put expression data in and we match that to responses to chemotherapeutic drugs.”
He said his team based the algorithm on the NCI-60 cell lines, which represent 9 types of cancer. The lines have been tested with more than 1,000 drugs, but the researchers have built predictive models for only 9 drugs so far, focusing on ovarian cancer.
“The algorithm goes through looking for correlations between various gene expression profiles, and then it will come out and say, ‘These are the most informative features in your gene expression data set in terms of correlating to the response to the drug,’” McDonald explained.
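The kind of correlation search McDonald describes can be illustrated with a minimal sketch. This is not the Georgia Tech team's code, and the article does not say which statistical method their platform uses. The sketch assumes a plain Pearson correlation between each gene's expression and a numeric drug-response score, and the function name, gene labels, and synthetic data are all hypothetical.

```python
import numpy as np


def rank_genes_by_correlation(expression, response, gene_names, top_k=5):
    """Rank genes by absolute Pearson correlation with a drug-response score.

    expression: array of shape (n_samples, n_genes)
    response:   array of shape (n_samples,)
    Returns a list of (gene_name, correlation) pairs, strongest first.
    Illustrative only; not the method from the PLOS ONE paper.
    """
    expression = np.asarray(expression, dtype=float)
    response = np.asarray(response, dtype=float)

    # Center each gene's expression and the response around their means.
    x = expression - expression.mean(axis=0)
    y = response - response.mean()

    # Pearson r per gene: covariance over the product of the norms.
    numerator = (x * y[:, None]).sum(axis=0)
    denominator = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())

    # Genes with no variation carry no signal, so their r stays at 0.
    r = np.zeros(expression.shape[1])
    valid = denominator > 0
    r[valid] = numerator[valid] / denominator[valid]

    # Sort by strength of association, regardless of sign.
    order = np.argsort(-np.abs(r))[:top_k]
    return [(gene_names[i], float(r[i])) for i in order]


# Synthetic stand-in data: 60 samples and 50 genes, where two genes
# (indices 3 and 17, chosen arbitrarily) actually drive the response.
rng = np.random.default_rng(0)
expression = rng.normal(size=(60, 50))
response = (2.0 * expression[:, 3] - 2.0 * expression[:, 17]
            + rng.normal(scale=0.3, size=60))
gene_names = [f"gene_{i}" for i in range(50)]

for name, r in rank_genes_by_correlation(expression, response, gene_names, top_k=3):
    print(f"{name}: r = {r:+.2f}")
```

Ranking by the absolute value of the correlation treats genes whose expression rises with response and genes whose expression falls with it as equally informative. A real predictive model would go further and validate the selected features on samples held out from training.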
The algorithm's predictions of treatment response are accurate more than 80% of the time. McDonald said the team is currently applying it to individual patient samples from Northside Hospital in Atlanta, Georgia.
McDonald stressed the potential for clinical significance in cases where patients fail therapy on the standard-of-care drugs, which for ovarian cancer are carboplatin and Taxol. As many as 30% of patients don’t respond to them. “It’s basically a crapshoot at this point. There’s no logic as to which drug should be tried next,” McDonald said. He added that his algorithm could help suggest drugs for second-line therapies.
The work was only made public in the last week, so it’s too early to know if the gauntlet has any takers. Still, McDonald is surprised by some of the early findings.
Predictive models for ovarian cancer are typically built with ovarian cancer samples. Instead, he and his colleagues saw better accuracy developing a model using 9 different cancers than using only the specific type of cancer. To McDonald, that suggests it would be wise not to distinguish cancers based on their tissue of origin. Some breast cancers, for example, might be more similar to ovarian cancers than first thought, he said.
The algorithm is open-source, and both the PLOS report and the official statement encourage others to take up the work and apply genetic samples to it. The team indicated that it didn’t care if the algorithm were taken and used by someone who may profit from it, as long as cancer treatments advanced.
“We’re advertising that sharing should be what everybody does,” Georgia Tech’s Fredrik Vannberg, DPhil, another study author, said. “This can be a win for everybody, but really it’s a win for the cancer patients.”