A team of international scientists led by researchers at Sanford Burnham Prebys Medical Discovery Institute (SBP) has undertaken the first-ever comparative analysis of a newly emerging category of algorithms that mine genetic information in cancer databases by focusing on internal gene structure (subgene resolution algorithms), in contrast to classical approaches that focus on genes treated as single units. These powerful data-sifting tools are helping untangle the complexity of cancer, and find previously unidentified mutations that are important in creating cancer cells.
The study, published today in Nature Methods, reviews, classifies and describes the strengths and weaknesses of more than 20 algorithms developed by independent research groups.
“Despite the increasing availability of high-resolution genome sequences, a common assumption is to consider a gene as a single unit,” explains Adam Godzik, PhD, director of the Bioinformatics and Structural Biology Program at SBP, and senior author of the study. “However, there are a number of events, such as single site DNA substitutions and splicing variants that can occur within a gene—at the subgene level. Subgene algorithms provide a high-resolution view that can explain why different mutations in the same gene can lead to distinct phenotypes, depending on how they impact specific protein regions.
“A good example of how different subgene mutations influence cancer is the NOTCH1 gene,” says Godzik. “Mutations in certain regions of NOTCH1 cause it to act as a tumor suppressor in lung, skin and head and neck cancers. But mutations in a different region can promote chronic lymphocytic leukemia and T cell acute lymphoblastic leukemia. So it’s incorrect to assume that mutations in a gene will have the same consequences regardless of their location.”
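As a toy illustration of the distinction Godzik describes, the sketch below tallies the same mutations two ways: once per gene and once per protein region. It is not one of the algorithms reviewed in the study. The gene names, region labels and records are hypothetical placeholders, not TCGA data.

```python
from collections import Counter

# Hypothetical toy records of (gene, protein_region); not real TCGA data.
MUTATIONS = [
    ("GENE_X", "region_A"),
    ("GENE_X", "region_A"),
    ("GENE_X", "region_A"),
    ("GENE_X", "region_B"),
    ("GENE_Y", "region_A"),
]


def count_by_gene(mutations):
    """Whole-gene view: every mutation in a gene lands in one bucket."""
    return Counter(gene for gene, _region in mutations)


def count_by_region(mutations):
    """Subgene view: mutations are tallied per (gene, region) pair."""
    return Counter(mutations)


gene_counts = count_by_gene(MUTATIONS)
region_counts = count_by_region(MUTATIONS)

for gene, total in sorted(gene_counts.items()):
    print(gene, "total mutations:", total)
for (gene, region), total in sorted(region_counts.items()):
    print(gene, region, "mutations:", total)
```

In the whole-gene tally, the cluster in one region of the hypothetical GENE_X is indistinguishable from mutations elsewhere in the gene; the per-region tally keeps them apart, which is the kind of signal subgene methods are designed to use.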
The researchers applied each subgene algorithm to data from The Cancer Genome Atlas (TCGA), a large-scale data set that includes genome data for 33 tumor types from more than 11,000 patients.
“Our goal was not to determine which algorithm works better than another, because that would depend on the question being asked,” says Eduard Porta-Pardo, PhD, a former postdoc in Godzik’s lab and first author of the paper, who is now a staff scientist at the Barcelona Supercomputing Center. “Instead, we want to inform potential users about how the different assumptions behind each subgene algorithm influence the results, and how the results differ from methods that work at the whole-gene level.”
“We found two important things,” says Porta-Pardo. “First, we found that the algorithms are able to reproduce the list of known cancer genes established by cancer researchers—validating the subgene approach and the link between these genes and cancer. Second, we found a number of new cancer driver genes—genes implicated in the process of oncogenesis—that were missed by whole-gene approaches.
“Finding new cancer driver genes is an important goal of cancer genome analysis,” adds Porta-Pardo. “This study should help researchers understand the advantages and drawbacks of subgene algorithms used to find new potential drug targets for cancer treatment.”
Authors on the publication include Eduard Porta-Pardo (SBP), Atanas Kamburov (Massachusetts General Hospital, Harvard Medical School, Broad Institute), David Tamborero (University of Pompeu Fabra and Institute for Biomedical Research), Tirso Pons (Spanish National Cancer Research Centre), Daniela Grases (SBP), Alfonso Valencia (Barcelona Supercomputing Center), Nuria Lopez-Bigas (University of Pompeu Fabra, Institute for Biomedical Research, Catalan Institution for Research and Advanced Studies), Gad Getz (Massachusetts General Hospital, Broad Institute of MIT and Harvard) and Adam Godzik (SBP).
DOI: 10.1038/nmeth.4364
About SBP
Sanford Burnham Prebys Medical Discovery Institute (SBP) is an independent nonprofit medical research organization that conducts world-class, collaborative, biological research and translates its discoveries for the benefit of patients. SBP focuses its research on cancer, immunity, neurodegeneration, metabolic disorders and rare children’s diseases. The Institute invests in talent, technology and partnerships to accelerate the translation of laboratory discoveries that will have the greatest impact on patients. Recognized for its world-class NCI-designated Cancer Center and the Conrad Prebys Center for Chemical Genomics, SBP employs about 1,100 scientists and staff in San Diego (La Jolla), Calif., and Orlando (Lake Nona), Fla. For more information, visit us at SBPdiscovery.org or on Facebook at facebook.com/SBPdiscovery and on Twitter @SBPdiscovery.