Machine learning based prediction of low birth weight and its associated risk factors: Insights from the Bangladesh Demographic and Health Survey 2022.
Authors: Nourin Sultana, Zeba Afia, Isteaq Kabir Sifat, Shamsuz Zoha, Tajin Ahmed Jisa, Md Kaderi Kibria
Journal: PLOS Global Public Health, 5(9), e0005187. Published 2025-09-30.
DOI: https://doi.org/10.1371/journal.pgph.0005187
Low birth weight (LBW) is a major public health concern, particularly in low- and middle-income countries, because it contributes to increased infant mortality and long-term health complications. This study applies and evaluates machine learning (ML) algorithms to predict LBW and identify its key risk factors in Bangladesh. The data comprised 3,192 complete records of ever-married women aged 15-49 years from the Bangladesh Demographic and Health Survey 2022. Risk factors for LBW were identified using four feature selection techniques: Boruta-based selection (BFS), LASSO regression, Elastic Net, and Random Forest (RF). Six ML algorithms were applied to predict LBW: Logistic Regression (LR), RF, Decision Tree (DT), Artificial Neural Networks (ANN), Extreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LGBM). Model performance was evaluated using accuracy, precision, recall, F1-score, area under the curve (AUC), and ROC analysis. SHAP values were used to examine the influence of individual features on the model's predictions. The prevalence of LBW in Bangladesh was 27.8%. Twelve features were selected, and the XGB model outperformed the other models, achieving an accuracy of 80% and an AUC of 0.761 under holdout (90:10) validation. SHAP analysis revealed that 'pregnancy duration' and 'division' were the strongest predictors of LBW risk, followed by 'marriage to first birth interval', 'ANC visits', 'C-section', and 'place of delivery'. These findings demonstrate that XGB can serve as an effective tool for predicting LBW and identifying important risk factors that may guide targeted interventions. The insights generated from this study can support public health strategies aimed at reducing LBW prevalence in Bangladesh.
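To make the evaluation setup concrete, the following is a minimal sketch of a 90:10 holdout split and of the reported metrics (accuracy, precision, recall, F1-score, AUC), written in plain Python. It is not the authors' code. The function names (holdout_split, classification_metrics, roc_auc), the random seed, and the toy labels and scores are assumptions made for illustration, and the abstract does not say whether the split was stratified.

```python
import random


def holdout_split(rows, test_fraction=0.10, seed=0):
    """Shuffle rows and split them into (train, test); 0.10 gives a 90:10 holdout."""
    rng = random.Random(seed)  # the seed is an arbitrary assumption
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = LBW, 0 = not LBW)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Splitting 3,192 record indices 90:10 leaves 2,873 for training and 319 for testing.
train_ids, test_ids = holdout_split(range(3192))
print(len(train_ids), len(test_ids))

# Toy labels and predicted probabilities, made up purely for illustration.
toy_labels = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.2, 0.5, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]
print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy, precision, recall and F1 depend on the classification threshold (0.5 in this sketch), whereas AUC is computed from the ranking of the predicted scores and does not depend on a threshold. This is one reason the two headline figures, 80% accuracy and an AUC of 0.761, are reported separately.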