Machine learning approach in predicting early antenatal care initiation at first trimester among reproductive women in Somalia: an analysis with SHAP explanations
Authors: Jamilu Sani, Mohamed Mustaf Ahmed
Journal: Intelligence-based medicine, volume 11, Article 100252 (published 2025-01-01)
DOI: 10.1016/j.ibmed.2025.100252
URL: https://www.sciencedirect.com/science/article/pii/S2666521225000560
Citation count: 0
Abstract
Introduction
Timely antenatal care (ANC) initiation is essential for maternal and neonatal health, enabling the early detection of risks and ensuring optimal care. In Somalia, delayed initiation of ANC poses a significant health risk. This study applied machine learning (ML) models to predict early ANC initiation among Somali women and identify key predictors using SHapley Additive exPlanations (SHAP).
Methods
Data from the 2020 Somali Health and Demographic Survey were analyzed, focusing on ANC timing in 3138 women aged 15–49. Six ML models (Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, K-Nearest Neighbors, and XGBoost) were assessed for accuracy, precision, recall, F1-score, and AUROC. Feature importance was evaluated using SHAP to interpret the influence of each predictor.
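The evaluation metrics named above can be illustrated with a minimal sketch. It uses only the Python standard library and a small set of made-up labels and scores; the function names, the 0.5 decision threshold, and the toy data are assumptions for illustration and are not taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = early ANC initiation)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one observation is required")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted probabilities from some classifier.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUROC is computed from the raw scores and reflects how well the model ranks positives above negatives.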
Results
Random Forest achieved the highest performance, with an accuracy of 70%, precision of 0.69, recall of 0.71, and AUROC of 0.74, closely followed by XGBoost, which achieved an accuracy of 69% and AUROC of 0.72. SHAP analysis identified the place of delivery, residence, and age group as the most influential predictors of early ANC initiation, with the number of births in the past five years showing a significant negative impact on the predicted likelihood of early initiation.
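To make the idea behind SHAP attributions concrete, the sketch below computes exact Shapley values for a tiny hand-written scoring function. Everything in it is hypothetical: the feature names, the coefficients, and the single all-zero baseline are illustrative assumptions, not the study's model or data. The SHAP library itself typically relies on faster estimators and a background dataset rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict

Features = Dict[str, float]


def shapley_values(
    predict: Callable[[Features], float], instance: Features, baseline: Features
) -> Dict[str, float]:
    """Exact Shapley values by enumerating all feature subsets.

    Features in a subset take the instance's value; the rest take the baseline's.
    """
    names = list(instance)
    n = len(names)

    def value(subset: set) -> float:
        x = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [k for k in names if k != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(x: Features) -> float:
    """Hypothetical score for early ANC initiation (made-up coefficients)."""
    return (
        0.3
        + 0.2 * x["facility_delivery"]
        + 0.1 * x["urban"]
        - 0.05 * x["recent_births"]
        + 0.05 * x["facility_delivery"] * x["urban"]  # interaction term
    )


instance = {"facility_delivery": 1.0, "urban": 1.0, "recent_births": 2.0}
baseline = {"facility_delivery": 0.0, "urban": 0.0, "recent_births": 0.0}

phi = shapley_values(toy_score, instance, baseline)
print(phi)
```

In this toy case the attributions sum to the difference between the instance's score and the baseline score, and the interaction term is split equally between the two features involved. A negative value, as for `recent_births` here, means the feature pushes the prediction below the baseline.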
Conclusion
Machine learning models, particularly Random Forest and XGBoost, effectively predicted early ANC initiation, highlighting significant demographic and healthcare access-related predictors. These findings suggest targeted interventions focusing on delivery location preferences, residential factors, and age-specific approaches to improve early ANC attendance in Somalia.