Machine learning model-based preterm birth prediction and clinical nomogram: A big retrospective cohort study
Authors: Ya Liu, Jiangling Liu, Heqing Shen
Journal: International Journal of Gynecology & Obstetrics (impact factor 2.6; JCR Q2, Obstetrics & Gynecology)
Publication date: 2024-11-18
DOI: 10.1002/ijgo.16036 (https://doi.org/10.1002/ijgo.16036)
Citations: 0
Abstract
Objective: This study sought to develop a multifactorial predictive model for preterm birth risk, with the goal of helping clinical practitioners intervene early to prevent it.
Methods: This retrospective cohort study used 2022 and 2018 National Vital Statistics System (NVSS) birth data. The 2022 cohort was randomly split into training (70%) and internal validation (30%) subsets, and the 2018 cohort was used for external validation. Four machine learning algorithms (logistic regression, adaptive lasso regression, bootstrap forest, and boosted trees) were used to identify features associated with preterm birth. The consensus features identified across the four models were then combined to construct a logistic regression-based preterm birth prediction nomogram. To evaluate the model's performance, calibration curves, receiver operating characteristic (ROC) curves, and decision curve analysis were applied to both the internal and external validation sets.
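The split and consensus steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the 70/30 split is a seeded random shuffle and that "consensus" means the intersection of the features each model selected. The function names and the placeholder feature names (`feature_a`, and so on) are hypothetical; the abstract does not list the nine features.

```python
import random


def split_cohort(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and internal validation subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def consensus_features(selected_by_model):
    """Return the features selected by every model (assumed: set intersection)."""
    feature_sets = [set(features) for features in selected_by_model.values()]
    return sorted(set.intersection(*feature_sets))


# Hypothetical selections; placeholder names only, not the study's risk factors.
selected = {
    "logistic_regression": ["feature_a", "feature_b", "feature_c", "feature_d"],
    "adaptive_lasso": ["feature_a", "feature_b", "feature_c"],
    "bootstrap_forest": ["feature_b", "feature_c", "feature_a", "feature_e"],
    "boosted_trees": ["feature_c", "feature_a", "feature_b", "feature_f"],
}

train, internal = split_cohort(range(1000))
common = consensus_features(selected)
```

The features in `common` would then be the only predictors passed to the final logistic regression that underlies the nomogram.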
Results: The study included 2,567,040 mother-infant pairs from the 2022 cohort and 2,688,568 mother-infant pairs from the 2018 cohort. All four machine learning models achieved an area under the curve (AUC) above 0.7 in predicting preterm birth, and the internal validation results indicated good model generalizability. Feature selection identified nine risk factors for preterm birth that were common to all four models. The prediction nomogram based on these nine features achieved AUCs of 0.701, 0.702, and 0.704 in the training, internal validation, and external validation sets, respectively. The calibration curves showed good agreement between predicted and observed risk, and the decision curve analysis indicated a net clinical benefit.
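For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them. It assumes the rank-based (Mann-Whitney) definition of AUC and the standard net benefit formula used in decision curve analysis, TP/n - (FP/n) * pt/(1 - pt) at threshold probability pt. The abstract does not state which software or exact formulas the authors used, and the labels and scores here are toy values, not study data.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Uses average ranks, so tied scores count as half a win.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def net_benefit(labels, scores, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy example: 1 = preterm birth, scores = predicted risk.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(labels, scores)              # 0.75
toy_nb = net_benefit(labels, scores, 0.5)  # 0.25
```

A decision curve plots `net_benefit` over a range of thresholds and compares it with the "treat all" and "treat none" strategies; a model is useful at a threshold when its curve lies above both.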
Conclusion: This study developed a reliable preterm birth prediction tool using large-scale birth cohort data, addressing the lack of external validation in existing preterm birth prediction models.
About the journal:
The International Journal of Gynecology & Obstetrics publishes articles on all aspects of basic and clinical research in the fields of obstetrics and gynecology and related subjects, with emphasis on matters of worldwide interest.