Using visible and near infrared spectroscopy and machine learning for estimating total petroleum hydrocarbons in contaminated soils

Impact factor 1.6; Zone 4 (Chemistry); JCR Q3 (CHEMISTRY, APPLIED)
Fereshteh Karimian, Shamsollah Ayoubi, Banafsheh Khalili, Seyed Ahmad Mireei, Yaseen Al-Mulla
Journal of Near Infrared Spectroscopy. Published 2024-08-27. DOI: 10.1177/09670335241269168 (https://doi.org/10.1177/09670335241269168)

Abstract

Petroleum pollution in soil severely damages areas affected by accidental releases of petroleum hydrocarbons and has destructive impacts on natural resources and environmental health. Monitoring and analysis are therefore critical; however, because chemical approaches are costly and time-consuming, a quick and cost-effective analytical method is valuable. This study evaluated the potential of visible and near-infrared (Vis-NIR) spectroscopy to predict total petroleum hydrocarbons (TPH) in polluted soils around the Shadegan ponds in southern Iran. One hundred soil samples showing various degrees of pollution were randomly collected from topsoil (0–10 cm). The soil samples were analyzed for TPH using Vis-NIR reflectance spectroscopy in the laboratory. After preprocessing transformations were applied, partial least squares (PLS) regression and two machine learning models, random forest (RF) and support vector machine (SVM), were examined. The results showed that reflectance at 1725 nm and 2311 nm served as indicative TPH features, with reflectance weakening as TPH increased. Among the preprocessing methods, baseline correction gave the best performance. According to the model evaluation criteria on the validation dataset, the three models ranked SVM > RF > PLS regression. The SVM model performed best on the validation dataset, with r² = 0.85, root mean square error of prediction (RMSEP) = 1.59%, and ratio of performance to deviation (RPD) = 2.6. Overall, this study provided strong evidence for the considerable potential of Vis-NIR spectroscopy as a rapid and cost-effective technique for estimating TPH levels in oil-contaminated soils, surpassing traditional chemical analytical methods in speed and cost. Combining mid-infrared (MIR) spectra with Vis-NIR data is expected to provide more comprehensive and accurate results when assessing soils in polluted areas, improving the reliability of the results across a diverse range of soil types.
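The three validation statistics quoted in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions based on common chemometrics usage: r² is taken as the squared Pearson correlation between measured and predicted values, RMSEP as the root mean square of the prediction errors, and RPD as the sample standard deviation of the measured values divided by RMSEP. The function names and the TPH values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def rmsep(observed, predicted):
    """Root mean square error of prediction over paired values."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of r^2)."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observed values / RMSEP."""
    return stdev(observed) / rmsep(observed, predicted)


# Hypothetical TPH values (%), invented purely for illustration.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]

print(f"r^2   = {r_squared(observed, predicted):.2f}")
print(f"RMSEP = {rmsep(observed, predicted):.2f} %")
print(f"RPD   = {rpd(observed, predicted):.2f}")
```

Because RPD scales the prediction error by the spread of the reference values, it allows comparison across datasets with different TPH ranges, which RMSEP alone does not.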
Source journal
CiteScore: 3.30
Self-citation rate: 5.60%
Articles published: 35
Review time: 6 months
About the journal: JNIRS (Journal of Near Infrared Spectroscopy) is a peer-reviewed journal publishing original research papers, short communications, review articles and letters concerned with near infrared spectroscopy and technology, its application, new instrumentation and the use of chemometric and data handling techniques within NIR.