Xi Bai, Zhibo Zhou, Zeyan Zheng, Yansheng Li, Kejia Liu, Yuanjun Zheng, Hongbo Yang, Huijuan Zhu, Shi Chen, Hui Pan
BMC Medical Informatics and Decision Making, 24(1):174. Published 2024-06-20. DOI: 10.1186/s12911-024-02556-6 (https://doi.org/10.1186/s12911-024-02556-6)
Development and evaluation of machine learning models for predicting large-for-gestational-age newborns in women exposed to radiation prior to pregnancy.
Introduction: The association between radiation exposure before pregnancy and abnormal birth weight has been established previously. However, no prediction model yet exists for large-for-gestational-age (LGA) births among women exposed to radiation before pregnancy.
Material and methods: Data were collected from the National Free Preconception Health Examination Project in China. A total of 455 neonates (42 LGA births and 423 non-LGA births) were included. The dataset was randomly divided into a training set (n = 319) and a test set (n = 136). Prediction models for LGA neonates were developed using conventional logistic regression (LR) and six machine learning methods. Recursive feature elimination was used to select the 10 features that contributed most to the prediction models, and the Shapley Additive Explanations (SHAP) method was applied to interpret the most important features affecting the model outputs.
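The random split described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `random_split`, the fixed seed, and the use of integer record indices are assumptions made for illustration; only the set sizes (319 and 136 out of 455) come from the text.

```python
import random


def random_split(n_total, n_train, seed=0):
    """Randomly partition record indices 0..n_total-1 into train and test lists."""
    if not 0 <= n_train <= n_total:
        raise ValueError("n_train must be between 0 and n_total")
    indices = list(range(n_total))
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


# Sizes taken from the abstract: 455 neonates, 319 for training, 136 for testing.
train_idx, test_idx = random_split(455, 319)
print(len(train_idx), len(test_idx))
```

A plain random split like this does not preserve the LGA/non-LGA ratio in each subset; whether the study stratified by outcome is not stated in the abstract.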
Results: The random forest (RF) model had the highest average area under the receiver-operating-characteristic curve (AUC) for predicting LGA in the test set (0.843, 95% confidence interval [CI]: 0.714-0.974). Except for the logistic regression model (AUC: 0.603, 95% CI: 0.440-0.767), the other models also achieved good AUCs. Among them, the final RF prediction model using 10 features achieved an average AUC of 0.821 (95% CI: 0.693-0.949).
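To make the reported metric concrete, the sketch below computes an AUC and a 95% confidence interval on toy data. It is illustrative only: the abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, as are the function names `auc` and `bootstrap_auc_ci` and the invented labels and scores.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample lacks one class; AUC undefined, skip it
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data invented for illustration; these are not values from the study.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.15, 0.9]
point = auc(y_true, y_score)
low, high = bootstrap_auc_ci(y_true, y_score)
print(point, low, high)
```

With a small test set, such intervals are wide, which is consistent with the broad ranges reported above.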
Conclusion: Machine learning-based prediction models might be a promising tool for the prenatal prediction of LGA births in women with radiation exposure before pregnancy.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.