Predicting child mortality determinants in Uttar Pradesh using Machine Learning: Insights from the National Family and Health Survey (2019–21)
Authors: Pinky Pandey, Sacheendra Shukla, Niraj Kumar Singh, Mukesh Kumar
Journal: Clinical Epidemiology and Global Health, volume 32, Article 101949 (JCR Q2, Public, Environmental & Occupational Health; impact factor 2.3)
Published: 2025-02-03
DOI: 10.1016/j.cegh.2025.101949
URL: https://www.sciencedirect.com/science/article/pii/S2213398425000387
Citations: 0
Abstract
Aim
This study aimed to delineate spatial variations in under-five mortality across Uttar Pradesh and evaluate the efficacy of various machine learning algorithms in identifying critical determinants influencing these mortality rates.
Methods
The study used data from the fifth round of the National Family and Health Survey (NFHS-5). Four machine learning algorithms, namely Random Forests, Logistic Regression, K-Nearest Neighbors (KNN), and Naive Bayes, were applied alongside a traditional logistic regression model. Predictive performance was evaluated using metrics such as model accuracy and receiver operating characteristic (ROC) curves. Descriptive analysis highlighted regional variations in under-five mortality rates.
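The two evaluation metrics named in the Methods can be illustrated with a minimal sketch. The abstract does not say which software the authors used, so this is not their implementation: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration and are not NFHS data. The area under the ROC curve is computed here through its pairwise (Mann-Whitney) interpretation, which is one standard way to summarise a ROC curve in a single number.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases whose thresholded score matches the true label.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    predictions = [1 if score >= threshold else 0 for score in y_score]
    correct = sum(pred == truth for pred, truth in zip(predictions, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    positives = [s for s, t in zip(y_score, y_true) if t == 1]
    negatives = [s for s, t in zip(y_score, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data: 1 = death before age five, 0 = survival.
y_true = [0, 0, 1, 1, 0, 1]
# Invented predicted probabilities from some hypothetical classifier.
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print(f"accuracy = {accuracy(y_true, y_score):.3f}")
print(f"ROC AUC  = {roc_auc(y_true, y_score):.3f}")
```

Accuracy depends on the chosen threshold and can look high when deaths are rare in the sample, whereas the AUC is threshold-free, which is one reason the two are often reported together.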
Results
Notable regional disparities in under-five mortality were observed across Uttar Pradesh. Predictive accuracies ranged from 76 % to 79.4 %, with the logistic regression model achieving the highest accuracy (79.4 %). All ML models demonstrated comparable predictive capabilities. The most effective model identified key determinants of under-five mortality, including breastfeeding status, number of births in the preceding five years, child's gender, birth intervals, antenatal care, birth order, type of water source, and maternal body mass index.
Conclusion
Machine learning models provide valuable insights into the determinants of under-five mortality, with the logistic regression model demonstrating superior predictive performance. Policy measures targeting critical factors, such as promoting breastfeeding, optimizing birth intervals, and improving maternal health and antenatal care, can significantly enhance childhood survival rates in Uttar Pradesh.
About the journal:
Clinical Epidemiology and Global Health (CEGH) is a multidisciplinary journal published four times a year (March, June, September, December). Its mandate is to promote articles on clinical epidemiology with a focus on developing countries in the context of global health, though articles from other countries are also accepted. It publishes original research across all disciplines of medicine and allied sciences related to clinical epidemiology and global health. The journal publishes Original articles, Review articles, Evidence Summaries, and Letters to the Editor. All articles published in CEGH are peer-reviewed and published online for immediate access and citation.