Investigating the association between groundwater contaminants and hypertension risk in India: a machine learning-based analysis.

IF 4.7 | Zone 3 (Medicine) | JCR Q2, Environmental Sciences
Sourav Biswas, Aparajita Chattopadhyay, Kathrin Schilling, Ayushi Das
Journal: Journal of Exposure Science and Environmental Epidemiology
Publication date: 2025-06-04
Publication type: Journal Article
DOI: 10.1038/s41370-025-00776-0 (https://doi.org/10.1038/s41370-025-00776-0)
Citations: 0

Investigating the association between groundwater contaminants and hypertension risk in India: a machine learning-based analysis.

Background: One-fourth of Indians are hypertensive, and the majority rely on groundwater for drinking. However, the role of groundwater physicochemical properties and contamination in hypertension remains understudied.

Objective: The study investigates the association between physicochemical groundwater characteristics and contaminants and hypertension risk in India.

Data: This study used data from the fifth round of the National Family Health Survey (NFHS-5, collected 2019-2021), including health, socio-demographic, and food and dietary information (n = 712,666 individuals). Groundwater physicochemical data were derived from the Central Groundwater Board (CGWB, 2019-2021). The groundwater data, taken from raster maps, were linked to NFHS-5 clusters using cluster shapefiles and then merged with individual records via cluster IDs.
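The linkage step described in the Data paragraph can be illustrated with a minimal sketch. It assumes the raster values have already been extracted for each survey cluster; the cluster IDs, field names, and numbers below are hypothetical and are not taken from NFHS-5 or CGWB. It shows only the final merge of cluster-level values onto individual records by cluster ID, not the authors' actual pipeline.

```python
from typing import Dict, List, Optional

# Hypothetical cluster-level groundwater values, e.g. sampled from a raster
# at each survey cluster location. All IDs and numbers are invented.
cluster_groundwater: Dict[int, Dict[str, float]] = {
    101: {"ph": 8.7, "nitrate_mg_l": 50.0},
    102: {"ph": 7.4, "nitrate_mg_l": 12.0},
}

# Hypothetical individual survey records, each carrying a cluster ID.
individuals: List[Dict[str, object]] = [
    {"person_id": 1, "cluster_id": 101, "hypertensive": True},
    {"person_id": 2, "cluster_id": 102, "hypertensive": False},
    {"person_id": 3, "cluster_id": 999, "hypertensive": False},  # no matching cluster
]


def attach_groundwater(
    people: List[Dict[str, object]],
    by_cluster: Dict[int, Dict[str, float]],
) -> List[Dict[str, object]]:
    """Left-join cluster-level groundwater values onto individual records."""
    merged = []
    for person in people:
        values: Optional[Dict[str, float]] = by_cluster.get(person["cluster_id"])
        row = dict(person)  # copy so the input records are not mutated
        for key in ("ph", "nitrate_mg_l"):
            # Unmatched individuals are kept; their exposure is marked as missing.
            row[key] = values[key] if values is not None else None
        merged.append(row)
    return merged


linked = attach_groundwater(individuals, cluster_groundwater)
```

Keeping unmatched individuals with a missing exposure value, rather than dropping them, is a design choice of this sketch; it makes the number of unlinked records easy to audit before any regression is run.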

Methods: Bivariate and multivariable regressions were used to identify factors associated with hypertension at the individual level. Moran's I statistics, Local Indicator of Spatial Association (LISA) cluster maps, and the Spatial Error Model (SEM) were used at district levels to investigate the spatial association. Machine learning models, including Artificial Neural Networks (ANN), Random Forest and Extreme Gradient Boosting (XGBoost), were used to predict hypertension risk zones.
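For readers unfamiliar with Moran's I, the global statistic mentioned in the Methods can be sketched as follows. This is the textbook definition, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The binary adjacency weights and the toy values are assumptions for illustration; the abstract does not state which weighting scheme or software the authors used.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for values observed on n spatial units.

    weights[i][j] is the spatial weight between units i and j
    (here assumed to be 1 for neighbours and 0 otherwise).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # total weight
    denom = sum(d * d for d in dev)         # sum of squared deviations
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / s0) * (num / denom)


# Four hypothetical districts in a row; each is a neighbour of the next.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1.0, 2.0, 3.0, 4.0], chain_weights)    # smooth gradient
alternating = morans_i([1.0, 0.0, 1.0, 0.0], chain_weights)  # checkerboard pattern
```

With these toy inputs the smooth gradient gives I = 1/3 (similar values sit next to each other), while the alternating pattern gives I = -1 (neighbours are systematically dissimilar).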

Results: Physicochemical drinking water composition is a key factor in hypertension risk. Elevated groundwater pH (>8.5, Adjusted Odds Ratio (AOR): 2.12), electrical conductivity (>300 μS/cm, AOR: 1.06), sulphate (>200 mg/L, AOR: 1.16), arsenic (>0.01 mg/L, AOR: 1.09), nitrate (>45 mg/L, AOR: 1.07), and magnesium (>30 mg/L, AOR: 1.03) are associated with higher odds of hypertension. The Random Forest model demonstrated the highest predictive performance, with a coefficient of determination (R²) of 0.9970, mean absolute error (MAE) of 0.0012, and mean squared error (MSE) of 0.0077. It effectively identified high-risk zones in the northwestern (Delhi, Punjab, Haryana, and Rajasthan) and eastern (West Bengal and Bihar) regions of India.
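The three performance measures quoted in the Results (R², MAE, MSE) have standard definitions, sketched below in plain Python. The observed and predicted values are invented for illustration; the abstract does not say how the authors split their data or on what scale the errors were measured, so this is not a reproduction of the reported figures.

```python
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R^2, MAE and MSE for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n                               # mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return {"r2": r2, "mae": mae, "mse": mse}


# Hypothetical district-level hypertension prevalence (as fractions).
observed = [0.20, 0.30, 0.40]
predicted = [0.25, 0.30, 0.35]
metrics = regression_metrics(observed, predicted)
```

For these toy values the residual sum of squares is 0.005 against a total sum of squares of 0.02, so R² is 0.75; MAE and MSE are expressed in the units of the outcome and its square, respectively.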

Impact: This study highlights how important groundwater quality is in determining the incidence of hypertension, pointing to groundwater physicochemical properties and contaminants such as electrical conductivity, sulphate, arsenic, nitrate, and magnesium as essential factors. Our research is the first of its kind to comprehensively map hypertension risk zones using machine learning models and geospatial analysis. The findings highlight that water quality is a modifiable risk factor, reinforcing the need for improved drinking water supply systems, regular water quality testing, and targeted interventions in high-risk regions. This study emphasizes the importance of intersectoral collaborations to enhance public health outcomes.

Source journal: Journal of Exposure Science and Environmental Epidemiology (JESEE)
CiteScore: 8.90
Self-citation rate: 6.70%
Articles published: 93
Review time: 3 months
Journal description: Journal of Exposure Science and Environmental Epidemiology (JESEE) aims to be the premier and authoritative source of information on advances in exposure science for professionals in a wide range of environmental and public health disciplines. JESEE publishes original peer-reviewed research presenting significant advances in exposure science and exposure analysis, including development and application of the latest technologies for measuring exposures, and innovative computational approaches for translating novel data streams to characterize and predict exposures. The types of papers published in the research section of JESEE are original research articles, translation studies, and correspondence. Reported results should further understanding of the relationship between environmental exposure and human health, describe evaluated novel exposure science tools, or demonstrate potential of exposure science to enable decisions and actions that promote and protect human health.