Stanford-Developed Algorithm Uses Big Data to Identify Potential Cancer Treatment Targets

Data scientists and clinical researchers collaborate on MiSL, or “Mining Synthetic Lethals,” to identify genetic duos within tumors that allow cancer to survive.

A newly published study in Nature Communications illustrates how data analytics can help lead to real, viable treatments for medicine’s most challenging diseases.

Addressing a dozen different cancers, Stanford University researchers used a computer algorithm to sift through volumes of data in order to identify synthetic lethals, pairs of complementary genes that aid in the survival of cancer cells. If one gene of the pair is mutated, the cell can survive by relying on the other, but if both are disrupted, the cell dies, making such pairings in cancer attractive targets for treatment.

Data scientists and clinical researchers at Stanford collaborated on the algorithm, which they named MiSL for “Mining Synthetic Lethals.” Data was drawn from The Cancer Genome Atlas (TCGA), a National Institutes of Health-associated collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). TCGA has produced over 2.5 petabytes of data on 33 tumor types from over 11,000 patients, making it a quintessential example of a genomic big data registry.

According to the researchers, MiSL is “a simple and scalable Boolean implication-based computational method that analyses mutation, copy number and gene expression data of primary tumours to identify SL partners of specific mutations in specific tumour types.” Working from the underlying assumption that “SL partners of a mutation will be amplified more frequently or deleted less frequently in primary tumour samples harbouring the mutation, with concordant changes in expression,” the team used data from about 3,000 primary tumor samples. Ultimately, they identified over 140,000 synthetic lethal partners for 3,120 mutations. Of the mutations, 1,084 had common MiSL candidates across multiple cancer types, indicating that certain genes are prone to such relationships.
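
To make that underlying assumption concrete, here is a minimal Python sketch of the idea, not the researchers' actual implementation. It simply compares how often a gene is amplified or deleted in tumors with and without a given mutation. The `Sample` structure, the `candidate_partners` function, the gene names and the fixed `min_gap` threshold are all hypothetical choices made for illustration; the published method relies on Boolean implications and also checks for concordant changes in expression, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One tumor sample (hypothetical structure, for illustration only)."""
    mutated: bool      # does the tumor carry the mutation of interest?
    copy_state: dict   # gene name -> "amp", "del", or "neutral"


def rate(samples, gene, state):
    """Fraction of samples in which `gene` has the given copy-number state."""
    if not samples:
        return 0.0
    matches = sum(s.copy_state.get(gene, "neutral") == state for s in samples)
    return matches / len(samples)


def candidate_partners(samples, genes, min_gap=0.2):
    """Return genes amplified more often, or deleted less often, in mutated tumors.

    `min_gap` is an arbitrary illustrative cutoff, not a value from the study.
    """
    mutated = [s for s in samples if s.mutated]
    wild_type = [s for s in samples if not s.mutated]
    hits = []
    for gene in genes:
        # How much more often is the gene amplified when the mutation is present?
        amp_gap = rate(mutated, gene, "amp") - rate(wild_type, gene, "amp")
        # How much less often is the gene deleted when the mutation is present?
        del_gap = rate(wild_type, gene, "del") - rate(mutated, gene, "del")
        if amp_gap >= min_gap or del_gap >= min_gap:
            hits.append(gene)
    return hits


if __name__ == "__main__":
    # Toy data with made-up gene names.
    toy = [
        Sample(True, {"GENE_A": "amp", "GENE_B": "neutral"}),
        Sample(True, {"GENE_A": "amp", "GENE_B": "del"}),
        Sample(False, {"GENE_A": "neutral", "GENE_B": "del"}),
        Sample(False, {"GENE_A": "del", "GENE_B": "neutral"}),
    ]
    print(candidate_partners(toy, ["GENE_A", "GENE_B"]))
```

In this toy data, GENE_A is amplified in every mutated tumor and in none of the others, so it is flagged as a candidate, while GENE_B shows the same pattern in both groups and is not. A real analysis would replace the fixed threshold with a proper statistical test across thousands of samples.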

From that large base, the team whittled down the prospects through DNA analysis to identify only those displaying “a true difference in gene expression levels of the partner based on whether the first gene was mutated,” cutting the candidates from the thousands to the tens. MiSL identified candidate synthetic lethals that had previously been found by other means, which helped validate the approach, along with a host of others that had not, including a pairing between a specific IDH1 gene mutation associated with leukemia and the gene ACACA. All told, they found 17 potential partnerships involving this mutation that may be targetable by drugs that are either already available or in development.

The work demonstrates that “MiSL solves two problems that are directly translatable to clinical applications: identifying novel mutation-specific SL interactions, in particular IDH1 mutation and ACACA in AML, and pinpointing predictive genetic biomarkers that can guide precise targeting of existing therapies,” according to the study.

A corresponding press release quotes one of the authors, Ravi Majeti, MD, PhD: “We have just scratched the surface of what we think we can learn with MiSL.” Majeti adds that such powerful, specific data analysis is “likely to make drug development much more efficient and quick.”