Comparing baseline correction algorithms in discriminating brownish soils from five proximity locations based on UPLC and PLS-DA methods

IF 1.4, Zone 4 (Medicine), Q3, MEDICINE, LEGAL
Muhamad Adib bin Ahmad, Loong Chuen Lee, Nur Ain Najihah Binti Mohd Rosdi, Nadirah Binti Abd Hamid, A. Ishak, Hukil Sino
Forensic Sciences Research. Published 2023-12-19. DOI: 10.1093/fsr/owad045 (https://doi.org/10.1093/fsr/owad045)

Abstract

Soil is commonly collected from outdoor crime scenes and can therefore help link a suspect and a victim to a crime scene. The chemical profiles of soils can be acquired with analytical instruments such as ultra-performance liquid chromatography (UPLC). However, UPLC chromatograms are often affected by an unstable baseline. In this paper, we compared the performance of five baseline correction (BC) algorithms, i.e., asymmetric least squares (AsLS), fill peak (FP), iterative restricted least squares, median window (MW), and modified polynomial fitting, in discriminating 30 chromatograms of brownish soils by five locations of origin, i.e., PP, HK, KU, BL and KB. The preprocessed sub-datasets were first inspected visually through their mean chromatograms and then explored further via scores plots from principal component analysis (PCA). Finally, the predictive performance of partial least squares-discriminant analysis (PLS-DA) models, estimated from 1000 pairs of training and testing sets prepared by iterative random resampling with a 75:25 split, was studied to identify the best BC method. The mean raw chromatograms of the ten soil samples differed from one another and showed evidently fluctuating baselines. The AsLS- and MW-corrected chromatograms showed the greatest improvement over their raw counterparts. Meanwhile, the PCA scores plots revealed that most of the sub-datasets produced three separate clusters. The sub-datasets were then modelled via PLS-DA. MW emerged as the best BC method based on the mean prediction accuracy estimated from the 1000 pairs of training and testing sets. In conclusion, MW outperformed the other BC methods in correcting the UPLC data of soil.
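To make the idea of a median-window baseline concrete, the following is a minimal sketch in Python, not the implementation used in the paper. It assumes the simplest form of the technique: the baseline at each point is taken as the running median of the signal in a window around that point, and the corrected chromatogram is the raw signal minus that baseline. The function names, the `half_window` parameter and the synthetic chromatogram are all hypothetical and chosen only for illustration.

```python
from statistics import median


def median_window_baseline(signal, half_window):
    """Estimate a baseline as the running median of the signal.

    Assumption: a plain moving median over 2 * half_window + 1 points,
    with the window truncated at both ends of the signal.
    """
    n = len(signal)
    baseline = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline.append(median(signal[lo:hi]))
    return baseline


def correct_baseline(signal, half_window):
    """Subtract the running-median baseline from the raw signal."""
    baseline = median_window_baseline(signal, half_window)
    return [s - b for s, b in zip(signal, baseline)]


# Synthetic example (hypothetical data): linear drift plus one narrow peak.
drift = [0.01 * i for i in range(200)]
peak = [5.0 if 98 <= i <= 102 else 0.0 for i in range(200)]
raw = [d + p for d, p in zip(drift, peak)]

corrected = correct_baseline(raw, half_window=20)
```

The window must be clearly wider than the peaks, otherwise the median follows the peak and the peak is subtracted along with the baseline. In this sketch the truncated windows at both ends give a biased estimate when the baseline drifts, so the first and last `half_window` points should be read with caution.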
Source journal: Forensic Sciences Research