Derivation and Validation of Prediction of Preterm Preeclampsia Using Machine Learning Algorithms.

IF 1.5 · Zone 4 (Medicine) · Q3 Obstetrics & Gynecology

American Journal of Perinatology · Pub Date: 2025-07-01 · Epub Date: 2024-12-04 · DOI: 10.1055/a-2495-3600

Tetsuya Kawakita, Juliana G Martins, Yara H Diab, Lea Nehme, George Saade

Pages: 1354-1361 · Publication type: Journal Article

Citations: 0

Abstract

Objective: This study aimed to develop machine learning (ML) models for predicting preterm preeclampsia using information available before 23 weeks' gestation.

Study Design: This was a secondary analysis of the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) cohort. We considered 131 features available before 23 weeks, including maternal demographics, obstetric and family history, social determinants of health, physical activity, nutrition, and early second-trimester ultrasound. Our primary outcome was preterm preeclampsia before 37 weeks. The dataset was randomly split into a training set (70%) and a validation set (30%). ML models were developed using glmnet, multilayer perceptron, random forest, XGBoost (extreme gradient boosting), and LightGBM. The final model was developed using the ML approach that achieved the best area under the curve (AUC). Further feature selection was conducted from the top 25 features ranked by SHapley Additive exPlanations (SHAP) values. The performance of the final model was assessed using the validation dataset.

Results: Of 9,467 individuals, 219 (2.3%) had preterm preeclampsia. The XGBoost model had the highest AUC among the models (0.749; 95% confidence interval [95% CI], 0.736-0.762). Therefore, XGBoost was used to develop models with fewer variables. The XGBoost model with eight features (in order of importance: mean uterine artery pulsatility index in the early second trimester, chronic hypertension, pregestational diabetes, uterine artery notch, systolic and diastolic blood pressure in the first trimester, body mass index, and maternal age) was chosen as the final model because its AUC of 0.741 (95% CI, 0.730-0.752) was not inferior to that of the original model (p = 0.58). In the validation dataset, the final model had an AUC of 0.779 (95% CI, 0.722-0.831). An online application of the final model was developed (https://kawakita.shinyapps.io/Preterm_preeclampsia/).

Conclusion: ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.

Key Points:

· Prediction models using uterine artery Doppler have not been adopted in the US.

· We developed a model using an ML algorithm.

· An online application of the final model was developed.

· ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.
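The evaluation steps named in the abstract (a random 70%/30% split, then an AUC with a 95% CI on the held-out set) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code. The data below are synthetic, the scores merely stand in for a trained model's predicted risks, and the percentile bootstrap is an assumed way of obtaining the interval, since the abstract does not state how the CIs were computed. The helper names `auc`, `random_split`, and `bootstrap_auc_ci` are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney rank statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def random_split(n, train_frac, rng):
    """Shuffle the indices 0..n-1 and cut them into training and validation parts."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = round(train_frac * n)
    return idx[:cut], idx[cut:]


def bootstrap_auc_ci(labels, scores, n_boot, rng, alpha=0.05):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC to be defined
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


rng = random.Random(0)

# Synthetic cohort: same size as the study, roughly 2.3% positives.
n = 9467
labels = [1 if rng.random() < 0.023 else 0 for _ in range(n)]
# Stand-in "model scores": cases are shifted upward by one standard deviation.
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

train_idx, valid_idx = random_split(n, 0.70, rng)
valid_labels = [labels[i] for i in valid_idx]
valid_scores = [scores[i] for i in valid_idx]

valid_auc = auc(valid_labels, valid_scores)
ci_low, ci_high = bootstrap_auc_ci(valid_labels, valid_scores, n_boot=200, rng=rng)
print(f"validation n={len(valid_idx)}, AUC={valid_auc:.3f} "
      f"(95% CI {ci_low:.3f}-{ci_high:.3f})")
```

With an outcome this rare, the validation set holds only a few dozen cases, so the interval on the held-out AUC is driven mostly by the number of positives. That is consistent with the validation CI in the abstract being wider than the training CIs.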


Source journal

American journal of perinatology (Medicine - Obstetrics & Gynecology)

CiteScore: 5.90

Self-citation rate: 0.00%

Articles published: 302

Review time: 4-8 weeks

About the journal: The American Journal of Perinatology is an international, peer-reviewed, and indexed journal publishing 14 issues a year of original research and topical reviews. It is the definitive forum for specialists in obstetrics, neonatology, perinatology, and maternal/fetal medicine, with emphasis on bridging the different fields. The focus is primarily on clinical and translational research, clinical and technical advances in diagnosis, monitoring, and treatment, and evidence-based reviews. Topics of interest include epidemiology, diagnosis, prevention, and management of maternal, fetal, and neonatal diseases. Manuscripts on new technology, NICU set-ups, and nursing topics are published to provide a broad survey of important issues in this field. All articles undergo rigorous peer review, with web-based submission, expedited turn-around, and availability of electronic publication. The American Journal of Perinatology is accompanied by AJP Reports, an Open Access journal for case reports in neonatology and maternal/fetal medicine.