Machine learning-based prediction of preterm birth risk using methylation changes in neonatal cord blood CpG sites.

IF 2.7; Zone 2, Medicine; JCR Q1, Obstetrics & Gynecology
Yuxin Feng, Ying Ni, Wenkai Wang, Fen Guo, Liyu Wang, Fan Zhu, Luyao Zhang, Ying Feng
Journal: BMC Pregnancy and Childbirth, 25(1):784
Published: 2025-07-22
Publication type: Journal Article
DOI: https://doi.org/10.1186/s12884-025-07884-7
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12285009/pdf/

Abstract

Background: Preterm birth, defined as delivery before 37 weeks of gestation, is a major cause of neonatal morbidity and mortality. DNA methylation changes at CpG sites have been associated with the risk of preterm birth.

Objective: This study aimed to identify differential CpG sites in cord blood and develop predictive machine learning models based on these methylation changes to assess preterm birth risk.

Methods: Methylome data from 110 neonatal cord blood samples in the GSE110828 dataset were analyzed to identify CpG sites whose methylation differed between preterm and full-term births. The samples were split into 88 for training and 22 for testing. Key CpG sites were selected using Lasso, Elastic Net, and Random Forest. Forty-five predictive models were constructed and evaluated on accuracy, precision, recall, and F1 score.
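The four evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `binary_metrics`, the label coding (1 = preterm, 0 = full-term), and the toy labels are assumptions made up for illustration and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions/labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels for eight samples: 1 = preterm, 0 = full-term.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy = binary_metrics(toy_true, toy_pred)
print(toy)
```

In this toy case one preterm sample is missed and one full-term sample is flagged, so accuracy is 6/8 while precision, recall, and F1 are each 2/3. This shows why accuracy alone can look better than the class-specific metrics when the classes are unbalanced.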

Results: Sixty-six CpG sites showed significant differences between preterm and full-term groups. Four models, including Random Forest with Lasso and Gradient Boosting with Random Forest, achieved optimal predictive performance, each with a validation accuracy of 93.75%.

Conclusion: DNA methylation changes at CpG sites in cord blood are associated with preterm birth risk. CpG-based methylation models demonstrate high predictive accuracy and hold promise for early clinical risk assessment.

Source journal: BMC Pregnancy and Childbirth (Obstetrics & Gynecology)
CiteScore: 4.90
Self-citation rate: 6.50%
Articles published: 845
Review time: 3-8 weeks
About the journal: BMC Pregnancy & Childbirth is an open access, peer-reviewed journal that considers articles on all aspects of pregnancy and childbirth. The journal welcomes submissions on the biomedical aspects of pregnancy, breastfeeding, labor, maternal health, and maternity care, as well as trends and sociological aspects of pregnancy and childbirth.