Development and validation of a machine learning-based predictive model for live birth outcomes following fresh embryo transfer in patients with endometriosis.

IF 2.7, Zone 3 (Medicine), JCR Q2 Genetics & Heredity
Suqin Zhu, Huiling Xu, Rongshan Li, Xiaojing Chen, Wenwen Jiang, Beihong Zheng, Yan Sun
Journal of Assisted Reproduction and Genetics. Journal Article. Published 2025-09-23. DOI: 10.1007/s10815-025-03677-1 (https://doi.org/10.1007/s10815-025-03677-1)
Citations: 0

Abstract

Objective: To develop a machine learning-based model that predicts live birth following fresh embryo transfer in patients with endometriosis, and to identify the key factors and reliable predictive markers that influence this outcome. By systematically evaluating multiple algorithms, the study seeks the best-performing model for elucidating high-risk factors affecting live birth, providing a basis for targeted interventions to raise the live birth rate in this population undergoing in vitro fertilization/intracytoplasmic sperm injection (IVF/ICSI).

Methods: This retrospective cohort study included 1836 patients with endometriosis who underwent fresh embryo transfer after IVF/ICSI at Fujian Provincial Maternity and Children's Hospital between 2018 and 2023. Participants were randomly allocated to a training set or a validation set in a 70:30 split (1285 in the training set and 551 in the validation set), making this an internal validation study. Independent variables were screened using the least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE) algorithms. Eight machine learning models were compared: decision tree (DT), K-nearest neighbor (KNN), logistic regression (LR), light gradient boosting machine (LightGBM), naive Bayes model (NBM), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost). The optimal hyperparameter configuration for each model was determined by grid search. All models were trained, and their performance was evaluated using receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis (DCA), and the Brier score (BS). XGBoost showed the best predictive performance and was selected as the final model. Feature importance analysis, combined with SHapley Additive exPlanations (SHAP) dependence plots, was used to characterize the relative contribution of each key feature and how it influences the model predictions.
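The three numeric evaluation measures named in the Methods (ROC AUC, Brier score, and the net benefit plotted in a decision curve) can be illustrated with a minimal, dependency-free sketch. The function names and the toy labels and probabilities below are invented for illustration; they are not the authors' code or data, and the abstract does not say which software was used. In practice, library routines such as scikit-learn's `roc_auc_score` and `brier_score_loss` compute the first two quantities.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Decision-curve net benefit when flagging cases with probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data, invented for illustration: 1 = live birth, 0 = no live birth.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_nb = net_benefit(y_true, y_prob, threshold=0.5)
print(f"AUC={toy_auc:.4f}  Brier={toy_brier:.5f}  net benefit@0.5={toy_nb:.3f}")
```

A higher AUC indicates better discrimination, a lower Brier score indicates better-calibrated and sharper probabilities, and a decision curve is obtained by evaluating `net_benefit` over a range of thresholds.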

Results: LASSO and RFE analyses identified eight predictive variables for model development. The AUC values for DT, KNN, LightGBM, LR, naive Bayes, RF, SVM, and XGBoost were 0.784, 0.987, 0.841, 0.800, 0.803, 0.988, 0.799, and 0.920 in the training set, and 0.765, 0.748, 0.801, 0.805, 0.810, 0.820, 0.807, and 0.852 in the validation (test) set, respectively. XGBoost demonstrated the highest predictive performance among all models. SHAP analysis identified anti-Müllerian hormone (AMH), female age, antral follicle count (AFC), infertility duration, GnRH agonist protocol, revised American Fertility Society (rAFS) stage, number of normally fertilized oocytes, and number of transferred embryos as key predictors of live birth following fresh embryo transfer in patients with endometriosis.
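The reported AUC values can be tabulated to compare models on the held-out set and to see how far each model's training AUC exceeds its held-out AUC. The sketch below uses only the numbers quoted in the Results. Reading a large train-minus-test gap as a sign of overfitting is a common heuristic added here for illustration, not a claim made in the abstract.

```python
# AUC values as reported in the Results paragraph (same model order).
models = ["DT", "KNN", "LightGBM", "LR", "Naive Bayes", "RF", "SVM", "XGBoost"]
train_auc = [0.784, 0.987, 0.841, 0.800, 0.803, 0.988, 0.799, 0.920]
test_auc = [0.765, 0.748, 0.801, 0.805, 0.810, 0.820, 0.807, 0.852]

# One row per model: (name, training AUC, held-out AUC, train-minus-test gap).
rows = [
    (name, tr, te, round(tr - te, 3))
    for name, tr, te in zip(models, train_auc, test_auc)
]

# Sort by held-out AUC, best first.
rows.sort(key=lambda r: r[2], reverse=True)

for name, tr, te, gap in rows:
    print(f"{name:<12} train={tr:.3f}  test={te:.3f}  gap={gap:+.3f}")

best_model = rows[0][0]
largest_gap_model = max(rows, key=lambda r: r[3])[0]
gaps = {name: gap for name, _, _, gap in rows}
```

On these figures XGBoost has the highest held-out AUC (0.852) with a gap of 0.068, while KNN and RF show near-perfect training AUCs but the largest drops on held-out data (0.239 and 0.168).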

Conclusion: This study developed a machine learning-based predictive model for live birth following fresh embryo transfer in patients with endometriosis and systematically compared the performance of multiple algorithms. The XGBoost model showed the best overall performance and can support timely identification of high-risk factors influencing live birth outcomes. These findings can inform targeted interventions to improve pregnancy outcomes in women with endometriosis.

Source journal: CiteScore 5.70; self-citation rate 9.70%; articles published: 286; review time: 1 month
About the journal: The Journal of Assisted Reproduction and Genetics publishes cellular, molecular, genetic, and epigenetic discoveries advancing our understanding of the biology and underlying mechanisms from gametogenesis to offspring health. Special emphasis is placed on the practice and evolution of assisted reproduction technologies (ARTs) with reference to the diagnosis and management of diseases affecting fertility. Our goal is to educate our readership in the translation of basic and clinical discoveries made from human or relevant animal models to the safe and efficacious practice of human ARTs. The scientific rigor and ethical standards embraced by the JARG editorial team ensure a broad international base of expertise guiding the marriage of contemporary clinical research paradigms with basic science discovery. JARG publishes original papers, minireviews, case reports, and opinion pieces often combined into special topic issues that will educate clinicians and scientists with interests in the mechanisms of human development that bear on the treatment of infertility and emerging innovations in human ARTs. The guiding principles of male and female reproductive health impacting pre- and post-conceptional viability and developmental potential are emphasized within the purview of human reproductive health in current and future generations of our species. The journal is published in cooperation with the American Society for Reproductive Medicine, an organization of more than 8,000 physicians, researchers, nurses, technicians and other professionals dedicated to advancing knowledge and expertise in reproductive biology.