Improved epigenetic age prediction models by combining sex chromosome and autosomal markers.

IF 3.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY
Zhong Wan, Peter Henneman, Huub C J Hoefsloot, Ate D Kloosterman, Pernette J Verschure
Journal: Epigenetics & Chromatin, 18(1): 45
Publication date: 2025-07-15
Publication type: Journal Article
DOI: https://doi.org/10.1186/s13072-025-00606-5
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12261677/pdf/
Citations: 0

Abstract

Background: Alterations in epigenetic DNA methylation (DNAm) can be used as an accurate and robust basis for biological age prediction. Because DNAm-based prediction modeling has predominantly relied on DNAm patterns on autosomes, we assessed the feasibility of incorporating sex chromosomal DNAm markers into an age prediction model based on six autosomal DNAm CpG markers.

Results: We employed random forest regression (RFR) to construct age prediction models from publicly available Infinium 450K DNAm microarray data of sex chromosomes from human whole blood and buffy coat samples, and assessed RFR model performance by the root-mean squared error (RMSE) and the mean absolute deviation (MAD) of cross-validation. Four types of models were constructed: models using DNAm probes on sex chromosomes only; models using probes on sex chromosomes and autosomes together; models using probes on sex chromosomes and/or autosomes with additional stratification by sex and/or age restriction; and reduced models comprising the best-performing sex chromosomal probes combined with the six best-performing autosomal probes from a previous study. Our data indicated no added predictive value of Y chromosomal DNAm markers in our best-performing prediction model, even though we acknowledge the potential of Y chromosomal markers for age prediction. In contrast, a significantly improved accuracy of age prediction was observed using a restricted set of X chromosomal probes combined with the six best-predicting autosomal DNAm probes. This reduced model gave an RMSE of 2.54 years and a MAD of 1.89 years. In particular, four DNAm markers on the X chromosome exhibited a strong correlation with age: cg27064949 (DGAT2L6), cg04532200 (PLXNB3), cg01882566 (RPGR) and cg25140188 (annotated to an intergenic region).
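The two error measures named above can be illustrated with a minimal sketch. It assumes that MAD here means the mean absolute difference between predicted and chronological age, which the abstract does not spell out; the ages below are hypothetical values chosen for illustration and are not taken from the study, and the code is not the authors' pipeline.

```python
import math


def rmse(true_ages, predicted_ages):
    """Root-mean squared error, in the same unit as the ages (years)."""
    errors = [p - t for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mad(true_ages, predicted_ages):
    """Mean absolute deviation between predicted and true ages (years).

    Assumption: 'MAD' is read as the mean of |predicted - true|.
    """
    errors = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


# Hypothetical chronological and predicted ages for four held-out samples.
true_ages = [30.0, 45.0, 60.0, 25.0]
predicted_ages = [32.0, 44.0, 57.0, 25.0]

print(f"RMSE: {rmse(true_ages, predicted_ages):.2f} years")
print(f"MAD:  {mad(true_ages, predicted_ages):.2f} years")
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAD and is never smaller than MAD on the same predictions, which matches the ordering of the reported values (2.54 versus 1.89 years).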

Conclusions: Our findings illustrate that an age prediction model built with a set of sex chromosomal markers combined with autosomal age-informative markers may serve as a high-accuracy model to predict chronological age, and may even be competitive with commonly used models built with autosomal DNAm markers only. This study represents a step towards the application of combined autosomal and sex chromosomal epigenetic age prediction models in aging and forensic research.

Highlights:
A set of age-prediction models based on DNA methylation (DNAm) markers on sex chromosomes and autosomes was constructed using random forest regression (RFR).
From the total dataset containing 1291 whole blood and 547 buffy coat blood samples, 860 whole blood samples were used as training set and 481 as test set, while 365 buffy coat samples were used as training set and 182 as test set.
Cross-validation of the RFR models constructed using more than 10,000 X and 30 Y chromosomal DNAm markers from all collected blood samples gave a root-mean squared error (RMSE) ranging from 7.70 to 14.29 years and a mean absolute deviation (MAD) ranging from 6.10 to 11.13 years.
Models constructed using sex-stratified and age-restricted data subsets showed RMSE and MAD values comparable to those of models constructed without stratification or restriction.
Models constructed using a selected set of 37 X chromosomal and six autosomal DNAm markers exhibited significantly improved age prediction performance, with a minimum RMSE of 2.54 years and MAD of 1.89 years.
Four X chromosomal DNAm markers were found to exhibit a significant correlation with age, as indicated by a Spearman correlation coefficient of 0.50.
In our data sets, Y chromosomal DNAm markers did not enhance the predictive performance of our best-performing age prediction model, even though we acknowledge their recognized potential for age prediction.
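The Spearman correlation mentioned in the highlights is the Pearson correlation computed on ranks. The sketch below shows that calculation using only the standard library; the function names and the methylation and age values are hypothetical, picked for illustration, and do not reproduce the study's data or its reported coefficient.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ages (years) and methylation levels at one CpG probe.
ages = [20, 35, 50, 65, 80]
methylation = [0.10, 0.30, 0.25, 0.40, 0.55]

print(f"Spearman rho: {spearman(ages, methylation):.2f}")
```

Working on ranks makes the measure sensitive to any monotonic trend of methylation with age, not only a linear one, which is one reason it is a common screening statistic for candidate age-informative probes.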

Source journal: Epigenetics & Chromatin (GENETICS & HEREDITY)
CiteScore: 7.00
Self-citation rate: 0.00%
Articles published: 35
Review time: 1 month
About the journal: Epigenetics & Chromatin is a peer-reviewed, open access, online journal that publishes research and reviews providing novel insights into epigenetic inheritance and chromatin-based interactions. The journal aims to understand how gene and chromosomal elements are regulated and how their activities are maintained during processes such as cell division, differentiation and environmental alteration.