Estimating baselines of Raman spectra based on transformer and manually annotated data.

Jiangsan Zhao, Tomasz Woznicki, Krzysztof Kusnierek
Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy, volume 330, article 125679. Published 2024-12-27. Journal Article.
DOI: 10.1016/j.saa.2024.125679 (https://doi.org/10.1016/j.saa.2024.125679)

Abstract

Raman spectroscopy is a powerful, non-invasive analytical method for determining the chemical composition and molecular structure of a wide range of materials, including complex biological tissues. However, the captured signals typically suffer from interference in the form of noise and a baseline, both of which need to be removed for successful data analysis. Effective baseline correction is critical in quantitative analysis, as it may impact the derivation of peak signatures. Current baseline correction methods can be labor-intensive and may require extensive parameter adjustment depending on the characteristics of the input spectrum. In contrast, deep learning-based baseline correction models trained across various materials offer a promising and more versatile alternative. This study reports an approach to manually identify the ground-truth baselines for eight different biological materials by extensively tuning the parameters of three classical baseline correction methods, Modified Multi-Polynomial Fit (Modpoly), Improved Modified Multi-Polynomial Fitting (IModpoly), and Adaptive Iteratively Reweighted Penalized Least Squares (airPLS), and combining their outputs to best fit the training data. We designed a one-dimensional Transformer (1dTrans) tailored to Raman spectral data for estimating their baselines, and evaluated its performance against a convolutional neural network (CNN), ResUNet, and the three aforementioned parametric methods. The 1dTrans model achieved lower mean absolute error (MAE) and spectral angle mapper (SAM) scores than the other methods in both development and evaluation on the manually labeled raw Raman spectra, highlighting the effectiveness of the method in Raman spectra pre-processing.
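
The two scores named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas the authors used, so the code below assumes the standard definitions: MAE as the mean of the absolute point-wise differences, and SAM as the angle (in radians) between the estimated and reference baselines treated as vectors. The function names, the `eps` guard, and the synthetic spectrum are all hypothetical and are not taken from the paper.

```python
import numpy as np


def mean_absolute_error(estimated, reference):
    """Mean of absolute point-wise differences between two baselines."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(estimated - reference)))


def spectral_angle(estimated, reference, eps=1e-12):
    """Angle in radians between two baselines viewed as vectors.

    Assumes the standard spectral angle mapper definition; `eps` is an
    illustrative guard against division by zero for all-zero inputs.
    """
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    denom = np.linalg.norm(estimated) * np.linalg.norm(reference) + eps
    cosine = np.dot(estimated, reference) / denom
    # Clip to the valid arccos domain to absorb floating-point round-off.
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))


if __name__ == "__main__":
    # Synthetic example (not data from the study): a smooth reference
    # baseline and a slightly perturbed estimate of it.
    shift = np.linspace(400.0, 1800.0, 500)          # Raman shift axis, cm^-1
    reference = 1e-6 * (shift - 400.0) ** 2 + 0.5    # hypothetical baseline
    estimated = reference + 0.01 * np.sin(shift / 100.0)
    print("MAE:", mean_absolute_error(estimated, reference))
    print("SAM (rad):", spectral_angle(estimated, reference))
```

Under these definitions MAE is sensitive to the absolute offset between the two baselines, whereas SAM depends only on their shape and is unchanged when one baseline is multiplied by a positive constant, which is one reason the two scores are usefully reported together.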
