Prediction of spontaneous preterm birth in pregnant women using machine learning
Xiaoxue Yang, Xuewu Song, Kun Yang, Peng Gao, Shuai Wang, Simin Zhang, Rong Qiang, Zhibin Li, Xinru Gao
Archives of Gynecology and Obstetrics, journal article, published 2025-07-12. DOI: 10.1007/s00404-025-08117-0 (https://doi.org/10.1007/s00404-025-08117-0)
Abstract
Purpose: Spontaneous preterm birth (sPTB) is a significant global health concern that contributes to adverse outcomes for both pregnant women and newborns. Early identification of women at risk of sPTB is essential for mitigating these effects and improving maternal and neonatal health outcomes. This study aims to explore the feasibility of using machine learning to predict sPTB risk and to analyze the contribution of individual variables.
Methods: All data were collected retrospectively. Prediction models were developed using eight machine learning algorithms combined with six variable selection methods. Predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), accuracy, sensitivity, F1-score, positive predictive value, and negative predictive value.
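The evaluation metrics listed above can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration, and average precision is used here as one common estimator of AUPRC.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Metrics that depend on a decision threshold (0.5 is an assumption)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0  # positive predictive value
    npv = tn / (tn + fn) if (tn + fn) else 0.0  # negative predictive value
    f1 = (2 * ppv * sensitivity / (ppv + sensitivity)
          if (ppv + sensitivity) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Requires at least one case of each class.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy data (hypothetical): 1 = preterm birth, 0 = term birth.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

metrics = threshold_metrics(labels, scores)
roc_auc = auroc(labels, scores)
avg_prec = average_precision(labels, scores)
brier = brier_score(labels, scores)
print(metrics, roc_auc, avg_prec, brier)
```

AUROC, AUPRC, and the Brier score are computed from the predicted probabilities directly, whereas accuracy, sensitivity, F1-score, and the predictive values change with the chosen threshold.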
Results: A total of 1122 pregnant women were enrolled, of whom 187 had preterm birth and 935 had term birth. The model combining the categorical boosting algorithm with backward elimination achieved the best predictive performance, with the highest AUROC (0.8762) and AUPRC (0.7061); its Brier score on the test set was 0.12. The top five variables for predicting sPTB risk in this study were free triiodothyronine, albumin/globulin, thyroglobulin antibody, total thyroxine, and red cell volume distribution width.
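As a rough aid to reading these figures, the sketch below derives the outcome prevalence from the reported counts and the reference values that an uninformative model would be expected to reach. It assumes the test set has the same prevalence as the whole cohort, which the abstract does not state. The variable names are illustrative.

```python
from fractions import Fraction

# Counts reported in the Results.
n_preterm = 187
n_term = 935
n_total = n_preterm + n_term

# Share of preterm births in the cohort.
prevalence = Fraction(n_preterm, n_total)

# A classifier that ranks cases at random has an expected AUPRC
# close to the prevalence of the positive class.
baseline_auprc = float(prevalence)

# A model that predicts the prevalence p for every woman has a
# Brier score of p * (1 - p).
baseline_brier = float(prevalence * (1 - prevalence))

print(f"prevalence     = {float(prevalence):.4f}")
print(f"baseline AUPRC = {baseline_auprc:.4f}")
print(f"baseline Brier = {baseline_brier:.4f}")
```

Under that assumption, the reported AUPRC of 0.7061 and Brier score of 0.12 can be compared against these reference values rather than against 0.5 or 0.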
Conclusions: The machine learning model may help identify pregnant women at high risk of sPTB, and analysis of individual risk factors could inform clinical decision-making. However, because some key variables are not part of routine laboratory testing during pregnancy worldwide, the model's generalizability and clinical applicability require further study.
About the journal:
Founded in 1870 as "Archiv für Gynaekologie", Archives of Gynecology and Obstetrics has a long and outstanding tradition. Since 1922 the journal has been the organ of the Deutsche Gesellschaft für Gynäkologie und Geburtshilfe. It is circulated in over 40 countries worldwide and is indexed in PubMed/Medline and Science Citation Index Expanded/Journal Citation Report.
The journal publishes invited and submitted reviews, peer-reviewed original articles on clinical topics and basic research, news and views, and guidelines and position statements from all sub-specialties in gynecology and obstetrics.