Advancing predictive analytics in child malnutrition: Machine, ensemble and deep learning models with balanced class distribution for early detection of stunting and wasting
Wisdom Richard Mgomezulu , Paul Thangata , Bertha Mkandawire , Nana Amoah
Human Nutrition and Metabolism, Volume 42, Article 200340. Published 2025-08-14. DOI: 10.1016/j.hnm.2025.200340. URL: https://www.sciencedirect.com/science/article/pii/S2666149725000441
Abstract
Child malnutrition remains a critical public health challenge in sub-Saharan Africa, and traditional surveillance methods have proved inadequate for early detection and intervention. This study applies machine learning and deep learning techniques to the prediction of stunting and wasting in Malawi, using nationally representative data from the World Bank's Living Standards Measurement Surveys (LSMS) to develop predictive models capable of identifying at-risk children before clinical manifestations emerge. Seven classification algorithms were evaluated, including ensemble methods (Random Forest, XGBoost), Deep Neural Networks (DNN), and traditional approaches (SVM, Logistic Regression, KNN, Gradient Boosting). Class imbalance was addressed through SMOTE and class weighting. Model performance was assessed on the balanced datasets using accuracy, precision, recall, F1-score, and AUC-ROC. Random Forest achieved perfect performance for wasting prediction (100 % accuracy, precision, recall, F1-score, and AUC-ROC) and near-perfect stunting classification (99.98 % accuracy). XGBoost performed comparably, with 99.49 % accuracy for wasting and 95.52 % for stunting. DNN achieved 91.50 % accuracy for wasting and 76.64 % for stunting, while traditional methods were moderately effective, with logistic regression achieving the lowest performance (66.58 % wasting, 64.72 % stunting accuracy). These findings support a shift toward proactive nutritional surveillance, in which vulnerable populations are identified early through data-driven approaches. The superior performance of ensemble algorithms gives policymakers tools for evidence-based resource allocation and targeted interventions. Implementing these predictive models within Malawi's health systems could improve early detection, facilitate timely nutritional interventions, and contribute to achieving global nutrition targets and reducing childhood mortality.
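The abstract names SMOTE for class balancing and accuracy, precision, recall, and F1-score for evaluation, but gives no implementation details. The sketch below is a minimal, standard-library illustration of those two ideas, not the authors' pipeline. The function names `smote` and `binary_metrics`, the parameter defaults, and the toy feature values are all assumptions made for illustration. AUC-ROC is omitted for brevity.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy two-feature minority samples, invented for illustration only.
    minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote(minority_class, n_new=4, k=2, seed=1))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In practice, synthetic samples are usually generated from the training split only, so that held-out metrics are computed on real records. A maintained library implementation is generally preferable to a hand-written one for production use.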