Element-specific estimation of background mutation rates in whole cancer genomes through transfer learning
Farideh Bahari, Reza Ahangari Cohan, Hesam Montazeri
npj Precision Oncology, volume 9, issue 1, article 92. Published 2025-03-29.
DOI: https://doi.org/10.1038/s41698-025-00871-3
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11953285/pdf/
Abstract
Mutational burden tests are essential for detecting signals of positive selection in cancer driver discovery by comparing observed mutation rates with background mutation rates (BMRs). However, accurate BMR estimation is challenging due to the diversity of mutational processes across genomes, which complicates driver discovery efforts. Existing methods rely on various genomic regions and features for BMR estimation but lack a model that integrates both intergenic intervals and functional genomic elements on a comprehensive set of genomic features. Here, we introduce eMET (element-specific Mutation Estimator with boosted Trees), which trains a model on 1372 (epi)genomic features from intergenic data and then fine-tunes that model with element-specific data through transfer learning. Applied to PCAWG somatic mutations, eMET significantly improves BMR accuracy and has the potential to enhance driver discovery. Additionally, we provide an extensive analysis of BMR estimation, examining different machine learning models, genomic interval strategies, feature categories, and dimensionality reduction techniques.
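To make the idea of a mutational burden test concrete, the sketch below compares the observed mutation count in one genomic element with the count expected under an estimated BMR. It is only an illustration under stated assumptions, not the test used by eMET: the Poisson model, the function names `poisson_upper_tail` and `burden_test`, and the parameters `bmr_per_bp`, `length_bp`, and `n_samples` are all hypothetical choices made here for clarity.

```python
import math


def poisson_upper_tail(k: int, mu: float) -> float:
    """Return P(X >= k) for X ~ Poisson(mu) by summing the tail directly.

    Assumption: moderate expected counts (mu well below ~700), so the
    first term does not underflow.
    """
    if k <= 0:
        return 1.0
    if mu <= 0.0:
        return 0.0
    # P(X = k), evaluated in log space for numerical stability.
    term = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while True:
        total += term
        i += 1
        term *= mu / i  # P(X = i) from P(X = i - 1)
        # Stop once past the mode and the remaining terms are negligible.
        if i > mu and term <= total * 1e-17:
            break
    return min(1.0, total)


def burden_test(observed: int, bmr_per_bp: float,
                length_bp: int, n_samples: int) -> float:
    """One-sided p-value for excess mutations in one element.

    Hypothetical setup: the expected count is the background rate per
    base pair per sample, scaled by element length and cohort size.
    """
    expected = bmr_per_bp * length_bp * n_samples
    return poisson_upper_tail(observed, expected)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    p = burden_test(observed=30, bmr_per_bp=1e-6, length_bp=1000, n_samples=2000)
    print(f"p-value for excess mutations: {p:.3e}")
```

A small p-value here would flag the element as mutated more often than the background model predicts, which is why the accuracy of the BMR estimate directly affects which elements are called as candidate drivers.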
About the journal:
Online-only and open access, npj Precision Oncology is an international, peer-reviewed journal dedicated to showcasing cutting-edge scientific research in all facets of precision oncology, spanning from fundamental science to translational applications and clinical medicine.