Researchers have leveraged the power of health data exchange to create a new catalogue of so-called “cancer drivers,” altered genes that are responsible for the progression of the disease. Scientists combined publicly available cancer mutation and protein structure databases to identify mutations that alter normal protein-protein interaction (PPI) interfaces.
The study, led by San Diego-based Sanford Burnham Prebys Medical Discovery Institute (SBP), was published Oct. 20 in the journal PLOS Computational Biology.
“This is the first time that three-dimensional protein features, such as PPIs, have been used to identify driver genes across large cancer datasets,” said lead author Eduard Porta-Pardo, PhD, a postdoctoral fellow at SBP. The work revealed 71 interfaces in proteins previously unrecognized as cancer drivers, he explained, representing potential new cancer predictive markers and/or drug targets.
“Our analysis also identified several driver interfaces in known cancer genes...proving that our method can find relevant cancer driver genes and that alterations in protein interfaces are a common pathogenic mechanism of cancer,” added Porta-Pardo.
The research team integrated tumor data from nearly 6,000 patients in The Cancer Genome Atlas with more than 18,000 3D protein structures from the Protein Data Bank, according to Adam Godzik, PhD, director of the Bioinformatics and Structural Biology Program at SBP. “The algorithm analyzes whether structural alterations of PPI interfaces are enriched in cancer mutations, and can therefore identify candidate driver genes,” Godzik said.
“Genes are not monolithic black boxes,” Godzik continued. “They have different regions that code for distinct protein domains that are usually responsible for different functions. It’s possible that a given protein only acts as a cancer driver when a specific region of the protein is mutated. Our method helps identify novel cancer driver genes and propose molecular hypotheses to explain how tumors apparently driven by the same gene have different behaviors.”
The project used an algorithm called e-Driver, which applies information on the 3D structures of mutated proteins to identify specific structural features. The study focused on PPI interfaces, where mutations in many known cancer driver genes are located. The specific goal of the systematic analysis was to find PPI interfaces enriched in cancer mutations.
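To make the idea of an interface "enriched in cancer mutations" concrete, here is a minimal sketch of one way such a test could look. It assumes a one-sided binomial test in which the expected share of mutations in an interface equals the interface's share of the protein's length. The article does not describe the statistic the study used, so the function names, the inputs, and the choice of test are illustrative assumptions, not the published method.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def interface_enrichment(interface_len, protein_len, interface_muts, total_muts):
    """One-sided p-value that an interface carries more mutations than expected.

    Assumption (hypothetical model): under the null, each of the protein's
    mutations lands in the interface with probability
    interface_len / protein_len.
    """
    expected_rate = interface_len / protein_len
    return binomial_tail(interface_muts, total_muts, expected_rate)


# Hypothetical protein: 200 residues, 20 of them in a PPI interface,
# 20 observed mutations in total.
p_enriched = interface_enrichment(20, 200, interface_muts=8, total_muts=20)
p_background = interface_enrichment(20, 200, interface_muts=2, total_muts=20)
print(f"8 of 20 mutations in interface: p = {p_enriched:.2g}")
print(f"2 of 20 mutations in interface: p = {p_background:.2g}")
```

In this toy setup, an interface holding 8 of 20 mutations is far above the 2 expected by length alone and yields a small p-value, while an interface holding 2 does not. A real analysis across thousands of interfaces would also need a multiple-testing correction.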
The image above, credited to Porta-Pardo at SBP, shows a cancer driver known as EGFR in active formation. Red indicates mutations that destroy the PPI.
The study co-authors noted that they can currently map about 40 percent of all the mutations in known driver genes to 3D structure models. “Many of the remaining 60 percent of mutations might also be altering interactions, but we will not know that until we increase the structural coverage of the human proteome,” their report states.
The researchers said the study results “represent only the tip of the iceberg of what can be achieved by including structural data in the analysis of cancer mutation profiles.” They predict their method will improve not only as more cancer genomes are added to existing data repositories, which increases the statistical power of the analysis, but also as structural coverage increases.
“We expect such increase to come from both new experimentally determined structures in public databases and the use of better modeling tools,” the paper states.
Cancer research and treatment options increasingly rely on data analysis for further breakthroughs. Last month, at a Capitol Hill briefing hosted by the American Society of Clinical Oncology, a patient advocate and survivor of a rare form of blood cancer noted that big data’s potential to provide insights about care remains largely untapped. He called for the application of available technology “to share and interoperate that data.”