Enhancing elemental quantification in LIBS with SHAP-guided emission line analysis: A soil carbon study

Impact factor 3.2 · Zone 2 (Chemistry) · JCR Q1 (Spectroscopy)
Davi Keglevich Neiva, Wesley Nascimento Guedes, Ladislau Martin-Neto, Paulino Ribeiro Villas-Boas
Spectrochimica Acta Part B: Atomic Spectroscopy, Volume 217, Article 106971. Published 2024-06-10. DOI: 10.1016/j.sab.2024.106971
https://www.sciencedirect.com/science/article/pii/S0584854724001150

Abstract

In laser-induced breakdown spectroscopy (LIBS), identifying key emission lines for accurate elemental quantification has long posed a challenge. Traditional methods rely on experimental knowledge, atomic databases, and intricate spectral analyses. Although machine learning techniques – such as boosting algorithms and neural networks – offer efficient processing for large datasets, the complexity of these techniques often compromises interpretability. To address this issue, our study integrates the SHapley Additive exPlanations (SHAP) algorithm with gradient boosting models in order to interpret the most important spectral features, thus enhancing our understanding of how specific emission lines contribute to the carbon (C) concentration predictions in soils. Deployed on a large dataset of 1019 soil samples, a wrapper method with a random forest regressor reduced the initial spectral intensity features from 13,748 to 1098. Subsequent application of a LightGBM regression model calibrated via the Optuna framework yielded – for training and validation sets, respectively – an R² of 0.98 and 0.77, and RMSE values of 1.55 and 4.54 g kg⁻¹. The SHAP summary plot showed that C emission lines influenced the model's predictions positively, as anticipated, whereas silicon (Si) emission lines produced a negative impact, suggesting a lower C concentration in sandy soils. Our findings not only validate the efficacy of SHAP in improving LIBS-based soil C quantification, but they also offer a sophisticated framework for decoding the complex interplay between emission lines and target elemental concentrations.
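
To make the SHAP interpretation described in the abstract more concrete, the following is a minimal sketch of the Shapley value idea that SHAP builds on. It is not the authors' pipeline: the model `toy_model`, its weights, the three feature names, and the sample and baseline intensities are all hypothetical values chosen for illustration, and the brute-force enumeration over feature orderings only works for a handful of features (the SHAP library uses much faster algorithms for tree ensembles such as LightGBM).

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over every ordering in
    which features are switched from the baseline value to the sample value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start from the reference spectrum
        previous = predict(current)
        for i in order:
            current[i] = x[i]         # reveal feature i
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orderings) for p in phi]


def toy_model(features):
    """Hypothetical stand-in for a fitted regressor (NOT the paper's model).

    The weights are invented: the C line raises the predicted carbon
    content, the Si line lowers it, and the background feature is ignored.
    """
    c_line, si_line, background = features
    return 10.0 + 8.0 * c_line - 5.0 * si_line + 0.0 * background


# Assumed, made-up normalized intensities for one sample and a reference.
x = [1.5, 2.0, 0.5]
baseline = [1.0, 1.0, 1.0]

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(["C line", "Si line", "background"], phi):
    print(f"{name}: {value:+.2f}")

# Additivity: contributions sum to prediction minus baseline prediction.
print(sum(phi), toy_model(x) - toy_model(baseline))
```

In this toy setting the C line receives a positive contribution and the Si line a negative one, which mirrors the sign pattern the abstract reports from the SHAP summary plot. The final line checks the additivity property: the per-feature contributions sum to the difference between the sample prediction and the baseline prediction.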


Source journal metrics
CiteScore: 6.10
Self-citation rate: 12.10%
Articles published: 173
Review time: 81 days
Journal description: Spectrochimica Acta Part B: Atomic Spectroscopy is intended for the rapid publication of both original work and reviews in the following fields: Atomic Emission (AES), Atomic Absorption (AAS) and Atomic Fluorescence (AFS) spectroscopy; Mass Spectrometry (MS) for inorganic analysis covering Spark Source (SS-MS), Inductively Coupled Plasma (ICP-MS), Glow Discharge (GD-MS), and Secondary Ion Mass Spectrometry (SIMS). Laser induced atomic spectroscopy for inorganic analysis, including non-linear optical laser spectroscopy, covering Laser Enhanced Ionization (LEI), Laser Induced Fluorescence (LIF), Resonance Ionization Spectroscopy (RIS) and Resonance Ionization Mass Spectrometry (RIMS); Laser Induced Breakdown Spectroscopy (LIBS); Cavity Ringdown Spectroscopy (CRDS), Laser Ablation Inductively Coupled Plasma Atomic Emission Spectroscopy (LA-ICP-AES) and Laser Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICP-MS). X-ray spectrometry, X-ray Optics and Microanalysis, including X-ray fluorescence spectrometry (XRF) and related techniques, in particular Total-reflection X-ray Fluorescence Spectrometry (TXRF), and Synchrotron Radiation-excited Total reflection XRF (SR-TXRF). Manuscripts dealing with (i) fundamentals, (ii) methodology development, (iii) instrumentation, and (iv) applications can be submitted for publication.