Zhong Wan, Peter Henneman, Huub C J Hoefsloot, Ate D Kloosterman, Pernette J Verschure
Epigenetics & Chromatin 18(1):45. Published 2025-07-15. DOI: https://doi.org/10.1186/s13072-025-00606-5. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12261677/pdf/. Impact factor 3.5; JCR Q1 (Genetics & Heredity).
Improved epigenetic age prediction models by combining sex chromosome and autosomal markers.
Background: Alterations in epigenetic DNA methylation (DNAm) can be used as an accurate and robust method for biological age prediction. We assessed the feasibility of incorporating sex chromosomal DNAm markers into a six autosomal DNAm CpG marker-based age prediction model, since DNAm-based prediction modeling has predominantly relied on analyzing DNAm patterns on autosomes.
Results: We employed random forest regression (RFR) to construct age prediction models with publicly available DNAm Infinium 450K microarray data of sex chromosomes from human whole blood and buffy coat samples, and assessed RFR model performance by the root-mean-square error (RMSE) and the mean absolute deviation (MAD) of cross-validation. Four types of models were constructed: models using DNAm probes on sex chromosomes only; models using probes on sex chromosomes and autosomes together; models using probes on sex chromosomes and/or autosomes with additional stratification by sex and/or age restriction; and reduced models comprising the best-performing sex chromosomal probes combined with the six best-performing autosomal probes from a previous study. Our data indicated no added predictive value of Y chromosomal DNAm markers in our best-performing prediction model, even though we acknowledge the potential of applying Y chromosomal markers for age prediction. Yet, a significantly improved accuracy of age prediction was observed using a restricted set of X chromosomal probes combined with the six best-predicting autosomal DNAm probes. In this reduced model we noted an RMSE and MAD of 2.54 and 1.89 years, respectively. In particular, four DNAm markers on the X chromosome exhibited a strong correlation with age: cg27064949 (DGAT2L6), cg04532200 (PLXNB3), cg01882566 (RPGR) and cg25140188 (annotated to an intergenic region).
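The two error measures named above can be illustrated with a minimal sketch. It assumes MAD here means the mean absolute difference between predicted and chronological age; the function names and the four example ages are hypothetical and are not taken from the study.

```python
import math


def rmse(true_ages, predicted_ages):
    """Root-mean-square error between chronological and predicted ages (years)."""
    squared = [(p - t) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))


def mad(true_ages, predicted_ages):
    """Mean absolute deviation between chronological and predicted ages (years)."""
    absolute = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    return sum(absolute) / len(absolute)


# Hypothetical illustration data, NOT values from the paper.
true_ages = [20.0, 35.0, 50.0, 65.0]
predicted_ages = [22.0, 34.0, 47.0, 65.0]

print(f"RMSE = {rmse(true_ages, predicted_ages):.2f} years")
print(f"MAD  = {mad(true_ages, predicted_ages):.2f} years")
```

Because RMSE squares each error before averaging, it penalizes large individual mispredictions more heavily than MAD does, which is why the reported RMSE values are always at least as large as the corresponding MAD values.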
Conclusions: Our findings illustrate that an age prediction model built with a set of sex chromosomal markers combined with autosomal age-informative markers may serve as a high-accuracy model to predict chronological age and may even be competitive with commonly used models built with autosomal DNAm markers only. This study represents a step towards the application of combined autosomal and sex chromosomal epigenetic age prediction models in aging and forensic research.
Highlights:
A set of age-prediction models based on DNA methylation (DNAm) markers on sex chromosomes and autosomes was constructed using random forest regression (RFR).
From the total dataset containing 1291 whole blood and 547 buffy coat blood samples, 860 whole blood samples were used as training set and 481 as test set, while 365 buffy coat samples were used as training set and 182 as test set.
Cross-validation of the constructed RFR models using more than 10,000 X and 30 Y chromosomal DNAm markers from all collected blood samples provided a root-mean-square error (RMSE) ranging from 7.70 to 14.29 years and a mean absolute deviation (MAD) ranging from 6.10 to 11.13 years.
Models constructed using sex-stratified and age-restricted data subsets demonstrated RMSE and MAD values comparable to those of models constructed without stratification or restriction.
Models constructed using a selected set of 37 X chromosomal and six autosomal DNAm markers exhibited a significantly improved age prediction performance, with a minimum RMSE of 2.54 years and MAD of 1.89 years.
A total of four X chromosomal DNAm markers were found to exhibit a significant correlation with age, as indicated by a Spearman correlation coefficient of 0.50.
In our data sets, Y chromosomal DNAm markers did not enhance the predictive performance of our best-performing age prediction model, even though we acknowledge their recognized potential for age prediction.
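The Spearman correlation mentioned in the highlights is a Pearson correlation computed on ranks. The sketch below shows that calculation for a single marker; the helper names, the ages and the methylation beta values are hypothetical and serve only as an illustration, not as the authors' procedure or data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ages (years) and beta values for one CpG probe, NOT study data.
ages = [21, 34, 45, 58, 70]
betas = [0.10, 0.15, 0.14, 0.30, 0.42]

print(f"Spearman rho = {spearman(ages, betas):.2f}")
```

A rank-based coefficient is a reasonable choice for methylation data because it captures any monotonic relationship with age and does not require the change in methylation to be linear.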
About the journal:
Epigenetics & Chromatin is a peer-reviewed, open access, online journal that publishes research and reviews providing novel insights into epigenetic inheritance and chromatin-based interactions. The journal aims to understand how gene and chromosomal elements are regulated and how their activities are maintained during processes such as cell division, differentiation and environmental alteration.