Shuzheng Zhang, Baoshan Ma, Yu Liu, Yiwen Shen, Di Li, Shuxin Liu, Fengju Song
Predicting locus-specific DNA methylation levels in cancer and paracancer tissues.
Epigenomics (impact factor 3.0; JCR Q2, Genetics & Heredity). Published 2024-03-13. DOI: 10.2217/epi-2023-0114 (https://doi.org/10.2217/epi-2023-0114). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11158003/pdf/
Abstract
Aim: To predict base-resolution DNA methylation in cancerous and paracancerous tissues.
Materials & methods: We collected six cancer DNA methylation datasets from The Cancer Genome Atlas and five cancer datasets from Gene Expression Omnibus, and built machine learning models on paired cancerous and paracancerous tissues. We evaluated the proposed method with tenfold cross-validation and independent validation.
Results: The cross-tissue prediction models substantially increased accuracy at more than 68% of CpG sites and helped enhance the statistical power of differential methylation analyses. An XGBoost model that leverages multiple correlated CpGs may further improve prediction accuracy.
Conclusion: This study provides a powerful tool for DNA methylation analysis and may yield new epigenetic insights into cancer.
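The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of cross-tissue prediction evaluated by tenfold cross-validation. It assumes a single CpG site, a simple linear regression standing in for the paper's machine learning models (it is not XGBoost), prediction of the cancer-tissue value from the paired paracancer value, and synthetic beta values. The function names, the prediction direction, the error metric, and the data are all illustrative assumptions rather than the authors' method.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        # Constant predictor: fall back to predicting the mean of y.
        return my, 0.0
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def clip01(v):
    """Methylation beta values lie in [0, 1], so clip predictions to that range."""
    return min(1.0, max(0.0, v))


def cv_mae(x, y, k=10, seed=0):
    """k-fold cross-validated mean absolute error of the linear predictor."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        errors.extend(abs(clip01(a + b * x[i]) - y[i]) for i in held_out)
    return statistics.fmean(errors)


# Synthetic paired samples for ONE hypothetical CpG site (not real data).
rng = random.Random(42)
paracancer = [rng.uniform(0.1, 0.9) for _ in range(100)]
cancer = [clip01(0.1 + 0.8 * p + rng.gauss(0, 0.05)) for p in paracancer]

mae = cv_mae(paracancer, cancer)
print(f"10-fold CV mean absolute error: {mae:.3f}")
```

In a multi-site setting, the same loop would be repeated for each CpG, and the single predictor could be replaced with several correlated CpGs and a gradient-boosted model, which is the direction the Results describe.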
About the journal:
Epigenomics provides a forum to address the rapidly progressing research developments in this ever-expanding field, and to report on the major challenges ahead and the critical advances that are propelling the science forward. The journal delivers this information in concise, at-a-glance article formats, invaluable to a time-constrained community.
Substantial developments in our current knowledge and understanding of genomics and epigenetics are constantly being made, yet this field is still in its infancy. Epigenomics provides a critical overview of the latest and most significant advances as they unfold and explores their potential application in the clinical setting.