Emerging investigator series: predicted losses of sulfur and selenium in European soils using machine learning: a call for prudent model interrogation and selection
Gerrad D. Jones, Logan Insinga, Boris Droz, Aryeh Feinberg, Andrea Stenke, Jo Smith, Pete Smith and Lenny H. E. Winkel
Environmental Science: Processes & Impacts, vol. 9, pp. 1503-1515. Published 2024-08-05. DOI: 10.1039/D4EM00338A. https://pubs.rsc.org/en/content/articlelanding/2024/em/d4em00338a
Abstract
Sulfur (S) deficiencies in crops have been attributed to reductions in atmospheric S deposition in recent decades. Similarly, global soil selenium (Se) concentrations were predicted to drop, particularly in Europe, because of increased leaching attributed to increasing aridity. Given Europe's international importance in agriculture, reductions of essential elements, including S and Se, in European soils could have important impacts on nutrition and human health. Our objectives were to model current soil S and Se levels in Europe and predict concentration changes for the 21st century. We interrogated four machine-learning (ML) techniques, but after critical evaluation, only the outputs of the linear support vector regression (Lin-SVR) models for S and Se and the multilayer perceptron (MLP) model for Se were consistent with known mechanisms reported in the literature. Other models exhibited overfitting even when differences between training and testing performance were low or non-existent. Furthermore, our results show that models that perform similarly based on RMSE or R² can lead to drastically different predictions and conclusions, highlighting the need to interrogate machine-learning models and to ensure they are consistent with known mechanisms reported in the literature. Both elements exhibited similar spatial patterns, with predicted gains in Scandinavia versus losses in the central and Mediterranean regions of Europe by the end of the 21st century under an extreme climate scenario. The median change was −5.5% for S (Lin-SVR) and −3.5% (MLP) and −4.0% (Lin-SVR) for Se. For both elements, modeled losses were driven by decreases in soil organic carbon and in S and Se atmospheric deposition, and gains were driven by increases in evapotranspiration.
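The abstract's point that models with similar RMSE or R² can diverge in their predictions can be illustrated with a toy sketch. This is not the authors' workflow or data: the predictor, response, and noise level are synthetic, and an ordinary least-squares line and a k-nearest-neighbour mean are swapped in as simple stand-ins for a linear and a flexible model (they are not Lin-SVR or MLP). Both are scored on held-out points inside the training range and then queried outside it, as would happen when projecting to a future climate.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a linear model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    return lambda x: a + b * x


def fit_knn(xs, ys, k=5):
    """Mean of the k nearest training responses (cannot extrapolate a trend)."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def rmse(model, xs, ys):
    """Root-mean-square error of model predictions against ys."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def r2(model, xs, ys):
    """Coefficient of determination on the given points."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data (assumption): predictor x in [0, 1], response y = 1 + 2x + noise.
rng = random.Random(0)
xs = [i / 199 for i in range(200)]
ys = [1 + 2 * x + rng.gauss(0, 0.1) for x in xs]

# Interleaved split: even indices train, odd indices test.
x_train, y_train = xs[0::2], ys[0::2]
x_test, y_test = xs[1::2], ys[1::2]

linear = fit_line(x_train, y_train)
knn = fit_knn(x_train, y_train)

# x = 2.0 lies outside the training range, mimicking a future-scenario query.
for name, model in [("linear", linear), ("knn", knn)]:
    print(
        name,
        "RMSE:", round(rmse(model, x_test, y_test), 3),
        "R2:", round(r2(model, x_test, y_test), 3),
        "prediction at x=2:", round(model(2.0), 2),
    )
```

In this sketch the two models score almost identically on the held-out points, yet the nearest-neighbour model returns roughly the last values it saw while the line continues the trend. Test metrics alone cannot say which behaviour is right, and that judgement has to come from checking the model against known mechanisms.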
About the journal:
Environmental Science: Processes & Impacts publishes high-quality papers in all areas of the environmental chemical sciences, including chemistry of the air, water, soil and sediment. We welcome studies on the environmental fate and effects of anthropogenic and naturally occurring contaminants, both chemical and microbiological, as well as related natural element cycling processes.