Modeling the global ocean distribution of dissolved cadmium based on machine learning-SHAP algorithm.

IF 8.2 | Zone 1, Environmental Science and Ecology | JCR Q1, Environmental Sciences
Science of the Total Environment Pub Date : 2025-01-01 Epub Date: 2024-12-12 DOI:10.1016/j.scitotenv.2024.177951
Ziyuan Jiang, Enhui Liao, Ziang Li, Ruifeng Zhang
Citations: 0

Abstract


Cadmium (Cd) is a bio-essential trace metal in the ocean that can be toxic at high concentrations, significantly impacting the marine environment and phytoplankton growth. Its distribution pattern closely follows that of phosphate (PO₄), although the mechanism is not fully understood. At low concentrations, evidence indicates that Cd can act as an enzyme cofactor in biological processes. The spatial distribution of dissolved cadmium (dCd) remains poorly understood, constrained by the limitations of current observational data. Building on those observations, this study applied machine learning methods to reconstruct a global dataset of dCd, aiming to improve the accuracy and comprehensiveness of dCd cycling analyses. A comparison of five machine learning algorithms (artificial neural network, support vector machine, Lasso regression, k-nearest neighbors, and random forest) found that the random forest model performed best (R² = 0.99, RMSE = 0.035 nmol kg⁻¹, MAE = 0.019 nmol kg⁻¹, MAPE = 0.345), reducing bias by 25% compared with previous studies. Using the SHapley Additive exPlanations (SHAP) approach, this study explored the factors influencing the dCd distribution at various depths and discussed potential causes of changes in the Cd-PO₄ relationship. The results showed that the temporal and spatial variability of Cd was influenced by surface biological processes, deep-sea mineralization, and seawater stratification. Variations in the Cd-PO₄ relationship were linked to differences in biological fractionation inside and outside high-nutrient, low-chlorophyll (HNLC) regions, as well as the mixing of water masses with different Cd:PO₄ ratios. Further analysis indicated that >80% of the particles that degraded into Cd and PO₄ were produced in HNLC regions. This study highlights the broad potential of machine learning in oceanography, offering a global perspective on Cd cycling and new insights into the mechanisms driving element cycling.
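
The abstract names four evaluation metrics and a Shapley-based attribution step but gives no implementation details. The sketch below is only an illustration of those two ideas, not the authors' code. It assumes MAPE is expressed as a fraction, which the abstract does not state. The predictor `toy_dcd`, its coefficients, and the input values are hypothetical. Shapley values are computed by brute force against a single baseline point, whereas SHAP libraries use faster algorithms and a background dataset.

```python
from itertools import combinations
from math import factorial, sqrt


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE (as a fraction) for paired values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # Assumes no observed value is zero.
        "mape": sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }


def exact_shapley(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to one baseline point.

    Features outside a coalition take their baseline value. The cost grows
    as 2^n, so this is feasible only for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


# Hypothetical toy predictor (NOT the paper's model): dCd as a function of
# phosphate, depth in km, and temperature, with made-up coefficients.
def toy_dcd(features):
    po4, depth_km, temp = features
    return 0.3 * po4 + 0.05 * depth_km * po4 - 0.002 * temp


x = [2.0, 1.5, 4.0]          # sample to explain (illustrative values)
baseline = [0.5, 0.1, 20.0]  # reference point (illustrative values)
phi = exact_shapley(toy_dcd, x, baseline)

for name, contribution in zip(["PO4", "depth", "temperature"], phi):
    print(f"{name}: {contribution:+.4f}")
print("sum of contributions:", round(sum(phi), 4))
print("prediction minus baseline:", round(toy_dcd(x) - toy_dcd(baseline), 4))
```

The last two printed values agree because Shapley values are additive: the per-feature contributions sum to the difference between the prediction and the baseline prediction. This is the property that lets a SHAP analysis split a predicted dCd value among its input features.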

Source journal
Science of the Total Environment (Environmental Sciences)
CiteScore: 17.60
Self-citation rate: 10.20%
Articles published: 8726
Review time: 2.4 months
About the journal: The Science of the Total Environment is an international journal dedicated to scientific research on the environment and its interaction with humanity. It covers a wide range of disciplines and seeks to publish innovative, hypothesis-driven, and impactful research that explores the entire environment, including the atmosphere, lithosphere, hydrosphere, biosphere, and anthroposphere. The journal's updated Aims & Scope emphasizes the importance of interdisciplinary environmental research with broad impact. Priority is given to studies that advance fundamental understanding and explore the interconnectedness of multiple environmental spheres. Field studies are preferred, while laboratory experiments must demonstrate significant methodological advancements or mechanistic insights with direct relevance to the environment.