Machine learning-based prediction of optimal antenatal care utilization among reproductive women in Nigeria

Impact factor: 4.9
Jamilu Sani, Adeyemi Oluwagbemiga, Mohamed Mustaf Ahmed
Journal: Machine learning with applications, volume 21, Article 100698
Published: 2025-06-27
DOI: 10.1016/j.mlwa.2025.100698
URL: https://www.sciencedirect.com/science/article/pii/S2666827025000817
Citations: 0

Abstract

Background

Despite global efforts, disparities in antenatal care (ANC) utilization persist in Nigeria, where maternal mortality remains alarmingly high (1047 deaths per 100,000 live births). Traditional statistical models often fall short in identifying complex non-linear relationships in population health data. Machine learning (ML) offers a promising alternative that uncovers hidden patterns and improves prediction accuracy.

Methods

This study used data from the 2018 Nigeria Demographic and Health Survey (NDHS), a nationally representative dataset. After data preprocessing and feature selection, six supervised ML algorithms (Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Decision Tree, Random Forest, and XGBoost) were applied using Python 3.9. Model performance was evaluated using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUROC). Feature importance was assessed using permutation importance and Gini impurity scores.
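The evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say which library or averaging scheme was used, so the function names (`classification_metrics`, `auroc`), the 0.5 decision threshold, and the toy labels and scores below are all hypothetical. It uses only the Python standard library and treats "optimal ANC utilization" as the positive class (1).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the NDHS.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1]  # assumed model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, scores)
print(metrics)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while AUROC depends only on how the scores rank positives against negatives, which is why the two kinds of metric are usually reported together.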

Results

Among all models, Random Forest achieved the best performance, with 90% accuracy, 0.90 precision and recall, an F1-score of 0.91, and an AUROC of 0.90. Permutation and Gini importance analyses identified the place of delivery, region, residence, and educational level as the most influential predictors. Other moderately important features included distance to health facilities, husband’s occupation, number of births, and healthcare decision-making autonomy, factors not highlighted by traditional statistical approaches.
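As a rough illustration of how permutation importance ranks predictors, the sketch below shuffles one feature column at a time and records the average drop in accuracy. It is a simplified standard-library version written for this note, not the procedure the authors ran. The `toy_model` rule, the two features (a facility-delivery flag and an unrelated noise column), and every data value are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(model: Callable[[Row], int], X: Sequence[Row], y: Sequence[int]) -> float:
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def permutation_importance(
    model: Callable[[Row], int],
    X: Sequence[Row],
    y: Sequence[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: column 0 = delivered in a health facility (1/0),
# column 1 = an unrelated noise value. Labels follow column 0 exactly.
X = [[1, 0.3], [1, 0.9], [0, 0.5], [0, 0.1], [1, 0.7],
     [0, 0.2], [1, 0.4], [0, 0.8], [1, 0.6], [0, 0.0]]
y = [int(row[0]) for row in X]


def toy_model(row: Row) -> int:
    """Stand-in for a fitted classifier: predicts from column 0 only."""
    return int(row[0] == 1)


importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the stand-in model ignores the noise column, shuffling it never changes a prediction and its importance is exactly zero, whereas shuffling the informative column lowers accuracy. Gini importance, by contrast, is computed from impurity reductions inside tree-based models during training and is not shown here.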

Conclusion

Machine learning, particularly Random Forest, demonstrated strong predictive power in identifying the key determinants of ANC utilization. These findings highlight the potential of ML to inform targeted maternal health interventions and improve outcomes in low-resource settings, such as Nigeria.
Source journal: Machine learning with applications
Subject areas: Management Science and Operations Research, Artificial Intelligence, Computer Science Applications
Self-citation rate: 0.00%
Articles published: 0
Review time: 98 days