Sourav Biswas, Aparajita Chattopadhyay, Kathrin Schilling, Ayushi Das
Journal of Exposure Science and Environmental Epidemiology. Published 2025-06-04. DOI: 10.1038/s41370-025-00776-0 (https://doi.org/10.1038/s41370-025-00776-0)
Investigating the association between groundwater contaminants and hypertension risk in India: a machine learning-based analysis.
Background: One-fourth of Indians are hypertensive, and most rely on groundwater for drinking. However, the role of groundwater physicochemical properties and contamination in hypertension remains understudied.
Objective: The study investigates the association between physicochemical groundwater characteristics and contaminants and hypertension risk in India.
Data: This study used data from the fifth round of the National Family Health Survey (NFHS-5, collected 2019-2021), including health, socio-demographic, and food and dietary information (n = 712,666 individuals). Groundwater physicochemical data were derived from the Central Groundwater Board (CGWB, 2019-2021). These raster-based groundwater data were linked to NFHS-5 records by using cluster shapefiles and merging the values with individual records via cluster IDs.
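The linkage step described above can be pictured as a left join on the cluster ID. The following is only a minimal sketch under stated assumptions: the field names (`cluster_id`, `ph`, `nitrate_mg_l`), the IDs, and all values are hypothetical, and the abstract does not say which software the authors used.

```python
# Hypothetical cluster-level groundwater values, standing in for values
# extracted from raster maps at each survey cluster. Illustrative only.
cluster_groundwater = {
    101: {"ph": 8.7, "nitrate_mg_l": 52.0},
    102: {"ph": 7.4, "nitrate_mg_l": 12.5},
}

# Hypothetical individual survey records carrying a cluster ID.
individuals = [
    {"person_id": 1, "cluster_id": 101, "hypertensive": True},
    {"person_id": 2, "cluster_id": 101, "hypertensive": False},
    {"person_id": 3, "cluster_id": 102, "hypertensive": False},
    {"person_id": 4, "cluster_id": 999, "hypertensive": True},  # no groundwater match
]


def link_by_cluster(people, cluster_values):
    """Left-join cluster-level values onto individual records by cluster_id."""
    # Collect every groundwater field so unmatched people get explicit None values.
    fields = sorted({key for values in cluster_values.values() for key in values})
    missing = dict.fromkeys(fields)  # every field mapped to None
    linked = []
    for person in people:
        values = cluster_values.get(person["cluster_id"], missing)
        linked.append({**person, **values})
    return linked


linked = link_by_cluster(individuals, cluster_groundwater)
for row in linked:
    print(row)
```

Keeping unmatched individuals with missing values, rather than dropping them, makes it easy to count how many records lack groundwater data before any modelling.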
Methods: Bivariate and multivariable regressions were used to identify factors associated with hypertension at the individual level. Moran's I statistics, Local Indicator of Spatial Association (LISA) cluster maps, and the Spatial Error Model (SEM) were used at district levels to investigate the spatial association. Machine learning models, including Artificial Neural Networks (ANN), Random Forest and Extreme Gradient Boosting (XGBoost), were used to predict hypertension risk zones.
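As an illustration of the spatial autocorrelation statistic named above, here is a minimal sketch of global Moran's I in plain Python. The four-unit layout, the binary adjacency weights, and the input values are assumptions made for the example; they are not the study's district data or its weighting scheme.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij * (x_i - mean) * (x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Hypothetical layout: four districts in a row, each adjacent to its neighbours.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth_gradient = [1, 2, 3, 4]   # neighbours have similar values
checkerboard = [1, 0, 1, 0]      # neighbours have opposite values

print(round(morans_i(smooth_gradient, line_weights), 4))
print(round(morans_i(checkerboard, line_weights), 4))
```

A positive value indicates that similar values cluster together in space, while a negative value indicates that neighbouring units tend to differ. LISA maps decompose this global statistic into per-unit contributions.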
Results: Physicochemical drinking water composition is a key factor in hypertension risk. Elevated groundwater pH (>8.5, Adjusted Odds Ratio (AOR): 2.12), electrical conductivity (>300 μS/cm, AOR: 1.06), sulphate (>200 mg/L, AOR: 1.16), arsenic (>0.01 mg/L, AOR: 1.09), nitrate (>45 mg/L, AOR: 1.07), and magnesium (>30 mg/L, AOR: 1.03) are associated with higher odds of hypertension. The Random Forest model demonstrated the highest predictive performance, with a coefficient of determination (R²) of 0.9970, mean absolute error (MAE) of 0.0012, and mean squared error (MSE) of 0.0077. It effectively identified high-risk zones in the northwestern (Delhi, Punjab, Haryana, and Rajasthan) and eastern (West Bengal and Bihar) regions of India.
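For readers unfamiliar with the three performance metrics reported above, the sketch below shows how R², MAE, and MSE are conventionally computed from observed and predicted values. The numbers are made up for illustration and are not the study's data; the function names are likewise hypothetical.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical district-level hypertension prevalence (as proportions).
observed = [0.2, 0.3, 0.4, 0.5]
predicted = [0.25, 0.3, 0.35, 0.5]

print("MAE:", mean_absolute_error(observed, predicted))
print("MSE:", mean_squared_error(observed, predicted))
print("R2: ", r_squared(observed, predicted))
```

MAE and MSE are expressed in the units (or squared units) of the outcome, so they are only comparable across models fitted to the same target, whereas R² is unitless.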
Impact: This study highlights how important groundwater quality is in determining the incidence of hypertension, pointing to groundwater physicochemical properties and contaminants such as electrical conductivity, sulphate, arsenic, nitrate, and magnesium as essential factors. Our research is the first of its kind to comprehensively map hypertension risk zones using machine learning models and geospatial analysis. The findings highlight that water quality is a modifiable risk factor, reinforcing the need for improved drinking water supply systems, regular water quality testing, and targeted interventions in high-risk regions. This study emphasizes the importance of intersectoral collaborations to enhance public health outcomes.
About the journal:
Journal of Exposure Science and Environmental Epidemiology (JESEE) aims to be the premier and authoritative source of information on advances in exposure science for professionals in a wide range of environmental and public health disciplines.
JESEE publishes original peer-reviewed research presenting significant advances in exposure science and exposure analysis, including development and application of the latest technologies for measuring exposures, and innovative computational approaches for translating novel data streams to characterize and predict exposures. The types of papers published in the research section of JESEE are original research articles, translation studies, and correspondence. Reported results should further understanding of the relationship between environmental exposure and human health, describe evaluated novel exposure science tools, or demonstrate potential of exposure science to enable decisions and actions that promote and protect human health.