Machine learning and explainable artificial intelligence to predict and interpret lead toxicity in pregnant women and unborn baby
Priyanka Chaurasia, Pratheepan Yogarajah, Abbas Ali Mahdi, Sally McClean, Mohammad Kaleem Ahmad, Tabrez Jafar, Sanjay Kumar Singh
Frontiers in Digital Health, 7:1608949 (published 2025-05-30). DOI: 10.3389/fdgth.2025.1608949. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12162601/pdf/
Abstract
Introduction: Lead toxicity is a well-recognised environmental health issue, with prenatal exposure posing significant risks to infants. A major pathway of infant exposure is maternal lead transfer during pregnancy. Therefore, accurately characterising maternal lead levels is critical for enabling targeted and personalised healthcare interventions. Current detection methods for lead poisoning rely on laboratory blood tests, which are not feasible for wide-population screening due to cost, accessibility, and logistical constraints. To address this limitation, our previous research proposed a novel machine learning (ML)-based model that predicts lead exposure levels in pregnant women using sociodemographic data alone. However, for such predictive models to gain broader acceptance, especially in clinical and public health settings, transparency and interpretability are essential.
Methods: Understanding the reasoning behind the predictions of the model is crucial to building trust and facilitating informed decision-making. In this study, we present the first application of an explainable artificial intelligence (XAI) framework to interpret predictions made by our ML-based lead exposure model.
Results: A Random Forest classifier was trained on a dataset of 200 blood samples described by 12 sociodemographic features, achieving an accuracy of 84.52%.
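
As a rough illustration of this setup (not the authors' code), the sketch below trains a Random Forest on a synthetic stand-in for the dataset: 200 rows, 12 numeric features, and a binary exposure label, all placeholder values, evaluated with 5-fold cross-validation in scikit-learn. The actual sociodemographic features and labels used in the study are not reproduced here.

```python
# Minimal sketch, assuming a tabular binary-classification setup similar to the
# one described in the abstract; data values are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 12))        # 200 samples, 12 sociodemographic features (placeholders)
y = rng.integers(0, 2, size=200)      # binary label: low vs. high lead exposure (placeholder)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"Mean 5-fold CV accuracy: {scores.mean():.4f}")

clf.fit(X, y)                         # refit on all data so it can be explained later
```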
Discussion: We applied two widely used XAI methods, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations), to provide insight into how each input feature contributed to the model's predictions.
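
The sketch below, continuing from the Random Forest example above (it reuses `clf` and `X`), shows one common way of applying SHAP and LIME to a fitted tree ensemble: SHAP for global per-feature contributions and LIME for a local explanation of a single prediction. The feature names and class labels are placeholders, the `shap` and `lime` packages must be installed separately, and this is only an assumed workflow, not the authors' implementation.

```python
# Minimal sketch of SHAP and LIME on a fitted RandomForestClassifier (clf)
# and feature matrix (X) from the previous example; names are placeholders.
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer

feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# SHAP: per-feature contributions across all samples (global view).
shap_explainer = shap.TreeExplainer(clf)
shap_values = shap_explainer.shap_values(X)
# Return shape differs across shap versions: a list of per-class arrays in
# older releases, a single (samples, features, classes) array in newer ones.
pos_class_vals = shap_values[1] if isinstance(shap_values, list) else shap_values[..., 1]
mean_abs_shap = np.abs(pos_class_vals).mean(axis=0)
for name, value in sorted(zip(feature_names, mean_abs_shap), key=lambda p: -p[1]):
    print(f"{name}: mean |SHAP| = {value:.4f}")

# LIME: local explanation for one individual prediction.
lime_explainer = LimeTabularExplainer(
    X,
    feature_names=feature_names,
    class_names=["low exposure", "high exposure"],  # placeholder class labels
    mode="classification",
)
explanation = lime_explainer.explain_instance(X[0], clf.predict_proba, num_features=12)
print(explanation.as_list())
```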