Suqin Zhu, Huiling Xu, Rongshan Li, Xiaojing Chen, Wenwen Jiang, Beihong Zheng, Yan Sun
Journal of Assisted Reproduction and Genetics (IF 2.7, JCR Q2). Published 2025-09-23. DOI: 10.1007/s10815-025-03677-1 (https://doi.org/10.1007/s10815-025-03677-1)
Development and validation of a machine learning-based predictive model for live birth outcomes following fresh embryo transfer in patients with endometriosis.
Objective: This study aims to develop a machine learning-based predictive model for patients with endometriosis and to identify the key factors and reliable predictive markers that influence live birth outcomes following fresh embryo transfer. By systematically evaluating multiple algorithms, it seeks the optimal model for elucidating high-risk factors affecting live birth, thereby providing a basis for targeted interventions to raise the live birth rate in this population undergoing in vitro fertilization/intracytoplasmic sperm injection (IVF/ICSI).
Methods: This retrospective cohort study included 1836 patients with endometriosis who underwent fresh embryo transfer via in vitro fertilization/intracytoplasmic sperm injection (IVF/ICSI) at Fujian Provincial Maternity and Children's Hospital between 2018 and 2023. Participants were randomly allocated to a training set or a validation set in a 70:30 split (1285 in the training set and 551 in the validation set), making this an internal validation study. Independent variables were screened using the least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE) algorithms. Optimal hyperparameter configurations for eight machine learning models, namely decision tree (DT), K-nearest neighbor (KNN), logistic regression (LR), light gradient boosting machine (LightGBM), naive Bayes model (NBM), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost), were determined by grid search. All models were trained, and their performance was evaluated using receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis (DCA), and the Brier score (BS). The XGBoost model exhibited the best predictive performance and was therefore selected as the final model. In addition, feature importance analysis combined with SHapley Additive exPlanations (SHAP) dependence plots was used to show the relative contributions of key features to the model predictions and how they influence them.
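The split and three of the evaluation metrics named in Methods can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the random seed, the toy labels and the toy probabilities are all hypothetical, and the abstract does not say which software was used. Only the cohort size (1836) and the 70:30 ratio come from the text.

```python
import random

def split_70_30(n_patients, seed=0):
    """Randomly allocate patient indices to a training and a validation set (70:30)."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, hypothetical choice
    n_train = int(n_patients * 0.7)       # 1836 * 0.7 = 1285.2, truncated to 1285
    return indices[:n_train], indices[n_train:]

def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt) at threshold pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy data (hypothetical): 1 = live birth, 0 = no live birth.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

train_idx, val_idx = split_70_30(1836)
print(len(train_idx), len(val_idx))
print(roc_auc(toy_labels, toy_probs))
print(brier_score(toy_labels, toy_probs))
print(net_benefit(toy_labels, toy_probs, threshold=0.5))
```

A lower Brier score indicates better-calibrated probabilities, whereas higher AUC and higher net benefit are better. In practice these metrics would be computed on the validation set for each of the eight models.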
Results: LASSO and RFE analyses identified eight predictive variables for model development. The AUC values for DT, KNN, LightGBM, LR, naive Bayes, RF, SVM, and XGBoost in the training set were 0.784, 0.987, 0.841, 0.800, 0.803, 0.988, 0.799, and 0.920, while those in the validation set were 0.765, 0.748, 0.801, 0.805, 0.810, 0.820, 0.807, and 0.852, respectively. XGBoost demonstrated the highest predictive performance among all models. SHAP analysis identified anti-Müllerian hormone (AMH), female age, antral follicle count (AFC), infertility duration, GnRH agonist protocol, revised American Fertility Society (rAFS) stage, normal fertilization number, and number of transferred embryos as key predictors for live birth following fresh embryo transfer in patients with endometriosis.
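The reported AUC values are easier to compare when tabulated. The sketch below uses only the sixteen numbers given in the Results paragraph; the helper names are hypothetical. It ranks the models by held-out AUC and computes the gap between training and held-out AUC. A large positive gap is commonly read as a sign of overfitting, although the abstract itself does not draw that conclusion.

```python
# AUC values as reported in the Results paragraph: (training set, held-out set).
reported_auc = {
    "DT": (0.784, 0.765),
    "KNN": (0.987, 0.748),
    "LightGBM": (0.841, 0.801),
    "LR": (0.800, 0.805),
    "naive Bayes": (0.803, 0.810),
    "RF": (0.988, 0.820),
    "SVM": (0.799, 0.807),
    "XGBoost": (0.920, 0.852),
}

def auc_gaps(table):
    """Training AUC minus held-out AUC for each model, rounded to 3 decimals."""
    return {name: round(train - held_out, 3) for name, (train, held_out) in table.items()}

def rank_by_held_out(table):
    """Model names sorted from highest to lowest held-out AUC."""
    return sorted(table, key=lambda name: table[name][1], reverse=True)

gaps = auc_gaps(reported_auc)
ranking = rank_by_held_out(reported_auc)

for name in ranking:
    train, held_out = reported_auc[name]
    print(f"{name:12s} train={train:.3f} held-out={held_out:.3f} gap={gaps[name]:+.3f}")
```

On these figures XGBoost ranks first on held-out AUC (0.852), while KNN and RF show the largest drops from training to held-out data (0.239 and 0.168).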
Conclusion: This study developed a machine learning-based predictive model for live birth following fresh embryo transfer in patients with endometriosis and systematically evaluated the comparative performance of multiple algorithms. The XGBoost model demonstrated superior overall performance, facilitating timely and precise identification of high-risk factors influencing live birth outcomes. These findings can inform targeted interventions to improve pregnancy outcomes in women with endometriosis.
About the journal:
The Journal of Assisted Reproduction and Genetics publishes cellular, molecular, genetic, and epigenetic discoveries advancing our understanding of the biology and underlying mechanisms from gametogenesis to offspring health. Special emphasis is placed on the practice and evolution of assisted reproduction technologies (ARTs) with reference to the diagnosis and management of diseases affecting fertility. Our goal is to educate our readership in the translation of basic and clinical discoveries made from human or relevant animal models to the safe and efficacious practice of human ARTs. The scientific rigor and ethical standards embraced by the JARG editorial team ensure a broad international base of expertise guiding the marriage of contemporary clinical research paradigms with basic science discovery. JARG publishes original papers, minireviews, case reports, and opinion pieces, often combined into special topic issues, that will educate clinicians and scientists with interests in the mechanisms of human development that bear on the treatment of infertility and emerging innovations in human ARTs. The guiding principles of male and female reproductive health impacting pre- and post-conceptional viability and developmental potential are emphasized within the purview of human reproductive health in current and future generations of our species.
The journal is published in cooperation with the American Society for Reproductive Medicine, an organization of more than 8,000 physicians, researchers, nurses, technicians and other professionals dedicated to advancing knowledge and expertise in reproductive biology.