Comparative analysis of convolutional neural networks and traditional machine learning models for IVF live birth prediction: a retrospective analysis of 48514 IVF cycles and an evaluation of deployment feasibility in resource-constrained settings.

IF 3.9, Zone 2 (Medicine), JCR Q2 ENDOCRINOLOGY & METABOLISM
Frontiers in Endocrinology Pub Date : 2025-06-12 eCollection Date: 2025-01-01 DOI:10.3389/fendo.2025.1556681
Yu Liu, Yi Wang, Kai Huang, Hao Shi, Hang Xin, Shanjun Dai, Jinhao Liu, Xinhong Yang, Jianyuan Song, Fuli Zhang, Yihong Guo
Frontiers in Endocrinology, volume 16, article 1556681. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12197960/pdf/
Citations: 0

Abstract

Objective: To evaluate the predictive performance of a convolutional neural network for analyzing electronic medical records in assisted reproductive therapy and to compare its accuracy and interpretability with traditional machine learning models. The study also explores the feasibility of deploying such models in resource-limited clinical settings.

Design: Retrospective cohort study based on EMR data using five models: CNN, Naïve Bayes, Random Forest, Decision Tree, and Feedforward Neural Network. Feature importance and model interpretability were evaluated using SHAP.

Setting: First Hospital of Zhengzhou University.

Population: 48,514 fresh IVF cycles from August 2009 to May 2018.

Methods: Preprocessed EMR data were used to train and evaluate five classification models predicting live birth outcomes. Stratified 5-fold cross-validation was performed for robust performance estimation. ROC curves and AUC values were used for comparative evaluation.
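
The evaluation protocol described above (stratified 5-fold cross-validation, summarised as mean ± standard deviation of AUC) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the synthetic one-feature data set, the class-mean "model", and the helper names `stratified_folds`, `auc`, and `cross_validate` are all assumptions made for the example.

```python
import random
import statistics

def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validate(X, y, k=5, seed=0):
    """Return mean and sample standard deviation of the per-fold AUC."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Toy "training": learn the direction between class means on the training folds only.
        mean_pos = statistics.mean(X[i] for i in train_idx if y[i] == 1)
        mean_neg = statistics.mean(X[i] for i in train_idx if y[i] == 0)
        sign = 1.0 if mean_pos >= mean_neg else -1.0
        scores = [sign * X[i] for i in test_idx]
        aucs.append(auc([y[i] for i in test_idx], scores))
    return statistics.mean(aucs), statistics.stdev(aucs)

# Synthetic, imbalanced data (assumption): 300 positives, 700 negatives, one noisy feature.
rng = random.Random(42)
y = [1] * 300 + [0] * 700
X = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]

mean_auc, sd_auc = cross_validate(X, y)
print(f"AUC: {mean_auc:.4f} ± {sd_auc:.4f}")
```

Stratifying by class keeps the outcome rate roughly equal in every fold, which matters when one class is much more common than the other.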

Main outcome measure: Live birth.

Results: The CNN model achieved an accuracy of 0.9394 ± 0.0013, AUC of 0.8899 ± 0.0032, precision of 0.9348 ± 0.0018, recall of 0.9993 ± 0.0012, and F1 score of 0.9660 ± 0.0007. Its accuracy was comparable to that of Random Forest (0.9406 ± 0.0017), although Random Forest achieved a higher AUC (0.9734 ± 0.0012 versus 0.8899 ± 0.0032), and it was superior to Decision Tree, Naïve Bayes, and Feedforward Neural Network in recall and robustness. The CNN demonstrated stable convergence during training, and SHAP-based interpretation highlighted maternal age, BMI, antral follicle count, and gonadotropin dosage as the top predictors of live birth outcome.
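
As a quick consistency check on the reported CNN metrics, the F1 score can be recomputed from precision and recall as their harmonic mean. The sketch below uses the fold-averaged values from the Results. The F1 of averaged precision and recall need not exactly equal the average of per-fold F1 scores, so agreement is only expected to be approximate. The helper name `f1_score` is an assumption for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Fold-averaged CNN values as reported in the abstract.
precision, recall = 0.9348, 0.9993

cnn_f1 = f1_score(precision, recall)
# Recall of 0.9993 implies roughly this many missed positives per 10,000 true positives.
missed_per_10k = (1 - recall) * 10_000

print(f"F1 from precision and recall: {cnn_f1:.4f}")
print(f"Missed positives per 10,000: {missed_per_10k:.1f}")
```

The recomputed value agrees with the reported F1 of 0.9660 to within rounding.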

Conclusions: With appropriate input transformation, CNNs can effectively model structured EMR data and offer predictive performance comparable to ensemble methods. Their scalability, high sensitivity, and interpretability make CNNs promising candidates for integration into clinical workflows, particularly in environments with limited computational resources.
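
The abstract does not say which input transformation was used. One common way to feed tabular records to a CNN, shown here purely as an assumption, is to rescale each feature and reshape the padded feature vector into a small 2D grid that convolutional layers can scan. The helper names `min_max_scale` and `to_grid` and the example records are hypothetical.

```python
import math

def min_max_scale(column):
    """Rescale one feature column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]

def to_grid(features, pad_value=0.0):
    """Pad a flat feature vector and reshape it into the smallest square grid that holds it."""
    side = math.ceil(math.sqrt(len(features)))
    padded = list(features) + [pad_value] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical records: age (years), BMI, antral follicle count,
# gonadotropin dose (IU), plus one made-up extra feature.
records = [
    [29, 21.5, 14, 1800, 3],
    [35, 24.0, 9, 2700, 1],
    [41, 27.3, 5, 3600, 2],
]

# Scale column-wise, then turn each scaled record into a 3x3 single-channel "image".
columns = [min_max_scale(col) for col in zip(*records)]
scaled = [list(row) for row in zip(*columns)]
grids = [to_grid(row) for row in scaled]

for line in grids[0]:
    print(line)
```

The grid layout is arbitrary for tabular data, since neighbouring cells have no inherent spatial meaning. In practice the feature ordering is therefore a design choice that may affect what a convolution can learn.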

Source journal: Frontiers in Endocrinology (Medicine: Endocrinology, Diabetes and Metabolism)
CiteScore: 5.70
Self-citation rate: 9.60%
Articles published: 3023
Review time: 14 weeks
Journal description: Frontiers in Endocrinology is a field journal of the "Frontiers in" journal series. In today's world, endocrinology is becoming increasingly important as it underlies many of the challenges societies face, from obesity and diabetes to reproduction, population control and aging. Endocrinology covers a broad field from basic molecular and cellular communication through to clinical care and some of the most crucial public health issues. The journal, thus, welcomes outstanding contributions in any domain of endocrinology. Frontiers in Endocrinology publishes articles on the most outstanding discoveries across a wide research spectrum of Endocrinology. The mission of Frontiers in Endocrinology is to bring all relevant Endocrinology areas together on a single platform.