Xiaoyi Zhang, Muhammad Usman, Ateeq ur Rehman Irshad, Mudassar Rashid, Amira Khattak
Investigating Spatial Effects through Machine Learning and Leveraging Explainable AI for Child Malnutrition in Pakistan
DOI: 10.3390/ijgi13090330 (https://doi.org/10.3390/ijgi13090330)
Published: 2024-09-16
Citations: 0
Abstract
While socioeconomic gradients in regional health inequalities are firmly established, the synergistic interactions between socioeconomic deprivation and climate vulnerability in nearby and neighbouring locations with health disparities remain poorly explored and require closer study within a regional context. Furthermore, spatial spillover effects and nonlinear effects of covariates on childhood stunting cannot be disregarded when dealing with the enduring issue of regional health inequalities. The present study aims to investigate spatial inequalities in childhood stunting at the district level in Pakistan and to validate the importance of the spatial lag in predicting childhood stunting. It also examines whether any of the selected independent features have a nonlinear relationship with childhood stunting. The study used socioeconomic data from MICS 2017–2018 and climatic data from Integrated Contextual Analysis. A multi-model approach was employed to address the research questions, comprising Ordinary Least Squares (OLS) regression, spatial models, machine learning algorithms and explainable artificial intelligence (EXAI) methods. First, OLS was used to test the linear relationships among the selected variables. Second, a Spatial Durbin Error Model (SDEM) was used to capture the impact of spatial spillover on childhood stunting. Third, the XGBoost and Random Forest machine learning algorithms were employed to examine and validate the importance of the spatial lag component. Finally, EXAI methods such as Shapley values were used to identify potential nonlinear relationships. The study found a clear pattern of spatial clustering and geographical disparities in childhood stunting, with multidimensional poverty, high climate vulnerability and early marriage worsening childhood stunting. 
In contrast, low climate vulnerability, high exposure to mass media and high women’s literacy were found to reduce childhood stunting. The machine learning algorithms, specifically XGBoost and Random Forest, highlighted the significant role that the neighbourhood average plays in predicting childhood stunting in nearby districts, confirming that the spatial spillover effect is not confined by geographical boundaries. Furthermore, EXAI methods such as partial dependence plots reveal a nonlinear relationship between multidimensional poverty and childhood stunting. The study’s findings provide valuable insights into the spatial distribution of childhood stunting in Pakistan and emphasize the importance of considering spatial effects when predicting it. Individual and household-level factors such as exposure to mass media and women’s literacy have shown beneficial effects on childhood stunting. The study further justifies the use of EXAI methods to draw better insights and to propose customised intervention policies.
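
The "average value in the neighbourhood" that serves as the spatial lag feature can be illustrated with a minimal sketch. It assumes a contiguity-based neighbour list with equal (row-standardised) weights; the abstract does not state which spatial weights the study used, and the district names, values and the `spatial_lag` helper below are hypothetical.

```python
from statistics import mean


def spatial_lag(values, neighbours):
    """Return, for each district, the mean value of its neighbours.

    This is the row-standardised spatial lag: every neighbour of a
    district receives the same weight, and the weights sum to one.
    """
    lag = {}
    for district, adjacent in neighbours.items():
        # A district without neighbours has no defined lag; use None
        # rather than inventing a value.
        lag[district] = mean(values[n] for n in adjacent) if adjacent else None
    return lag


# Hypothetical toy data: four districts arranged in a line, A-B-C-D.
stunting = {"A": 0.50, "B": 0.40, "C": 0.30, "D": 0.20}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

lag = spatial_lag(stunting, neighbours)
for district in sorted(lag):
    print(district, stunting[district], round(lag[district], 3))
```

In a workflow like the one described, such a lag column would be appended to the socioeconomic and climatic covariates before fitting a tree-based model, so that its feature importance can be compared with that of the other predictors.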