Integrating gene mutation spectra from tumors and the general population with gene expression topological networks to identify novel cancer driver genes.
Shuangyu Yang, Dan He, Ling Li, Zhiya Lu, Shaoying Li, Tianjun Lan, Feiyi Liu, Huasong Zhang, David N Cooper, Huiying Zhao
Human Genetics, pp. 775-794. Published: 2025-07-01 (Epub: 2025-06-14). DOI: 10.1007/s00439-025-02755-9 (https://doi.org/10.1007/s00439-025-02755-9). Publication type: Journal Article. Impact factor: 3.8; JCR Q2 (Genetics & Heredity).
Abstract
Discovering cancer driver genes is critical for improving survival rates. Current methods often overlook the varying functional impacts of mutations. A method that integrates mutation pathogenicity with gene expression data is therefore needed to improve the identification of novel cancer drivers. To predict cancer drivers, we have developed a framework (DGAT-cancer) that integrates the pathogenicity of somatic mutations in tumors and of germline variants in the healthy population with topological networks of gene expression in tumors and with gene expression in tumor and paracancerous tissues. By taking a comprehensive view of mutation function within its biological context, this integration overcomes a limitation of current methods, which assume that all mutations have a uniform impact. These features were filtered by an unsupervised approach, Laplacian selection, and combined by Hotelling and Box-Cox transformations to score genes. Gibbs sampling was then performed, with the gene scores as weights, to identify cancer drivers. DGAT-cancer was applied to cohorts of seven cancer types and achieved the best area under the precision-recall curve (AUPRC ranging from 0.646 to 0.862) compared with five commonly used methods (AUPRC ranging from 0.357 to 0.629). DGAT-cancer identified 505 cancer drivers. Knockdown of the top-ranked gene, EEF1A1, led to a ~41-50% decrease in glioma size and improved the temozolomide sensitivity of glioma cells. By combining heterogeneous genomic and transcriptomic data, DGAT-cancer significantly improves our ability to detect novel cancer drivers and offers an innovative approach to revealing cancer therapeutic targets, thereby advancing the development of more precise and effective cancer treatments.
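The abstract compares methods by AUPRC but does not say how the area was computed. As a rough illustration only, the sketch below uses average precision, a common step-wise estimator of the area under the precision-recall curve; it is not taken from DGAT-cancer, and the function name, scores, and labels are hypothetical toy values.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    scores: one score per gene (higher means more driver-like).
    labels: 1 for a known driver gene, 0 otherwise.
    Assumption: this is a generic estimator, not the paper's own procedure.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank genes from highest to lowest score; ties keep their input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each true driver
    return total / n_pos


# Hypothetical toy ranking: five genes, two of them known drivers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
toy_ap = average_precision(toy_scores, toy_labels)
```

In this toy ranking the two known drivers sit at ranks 1 and 3, so the estimate is (1/1 + 2/3) / 2. A method that ranks every known driver ahead of every other gene scores 1.0, which is why a higher AUPRC indicates a better ranking.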
About the journal:
Human Genetics is a monthly journal publishing original and timely articles on all aspects of human genetics. The Journal particularly welcomes articles in the areas of Behavioral genetics, Bioinformatics, Cancer genetics and genomics, Cytogenetics, Developmental genetics, Disease association studies, Dysmorphology, ELSI (ethical, legal and social issues), Evolutionary genetics, Gene expression, Gene structure and organization, Genetics of complex diseases and epistatic interactions, Genetic epidemiology, Genome biology, Genome structure and organization, Genotype-phenotype relationships, Human Genomics, Immunogenetics and genomics, Linkage analysis and genetic mapping, Methods in Statistical Genetics, Molecular diagnostics, Mutation detection and analysis, Neurogenetics, Physical mapping and Population Genetics. Articles reporting animal models relevant to human biology or disease are also welcome. Preference will be given to those articles which address clinically relevant questions or which provide new insights into human biology.
Unless reporting entirely novel and unusual aspects of a topic, clinical case reports, cytogenetic case reports, papers on descriptive population genetics, articles dealing with the frequency of polymorphisms or additional mutations within genes in which numerous lesions have already been described, and papers that report meta-analyses of previously published datasets will normally not be accepted.
The Journal typically will not consider for publication manuscripts that report merely the isolation, map position, structure, and tissue expression profile of a gene of unknown function unless the gene is of particular interest or is a candidate gene involved in a human trait or disorder.