Identification of Monteverdia ilicifolia by Fourier-transform mid-infrared spectroscopy associated with chemometrics and machine learning

Impact factor: 3.7 | Zone 2 (Chemistry) | JCR Q2 (Automation & Control Systems)
Ahmad Kassem El Zein, Alexandre de Fátima Cobre, Raul Edison Luna Lazo, Kevin Alves Antunes, Jane Manfron, Luana Mota Ferreira, Roberto Pontarolo
Chemometrics and Intelligent Laboratory Systems, volume 263, Article 105420. Published 2025-05-03. Journal article.
DOI: 10.1016/j.chemolab.2025.105420
https://www.sciencedirect.com/science/article/pii/S0169743925001054
Citations: 0

Abstract

Monteverdia ilicifolia (Mart. ex Reissek) Biral, a member of the Celastraceae botanical family, is widely recognized for its broad-spectrum therapeutic effects in South America, particularly in Brazil, where it is commonly referred to as “espinheira-santa”. This study aimed to develop a chemometric and machine learning-based method to accurately identify M. ilicifolia and differentiate it from morphologically similar species used as adulterants. Fourier transform mid-infrared spectrometry (MIR-FTIR) was used to analyze leaves (n = 6 species, 3000 spectra), powders (n = 6 species, 3000 spectra) and extract samples (n = 6 species, 600 spectra). The spectral datasets were modeled by Principal Component Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLS-DA). The PLS-DA model was challenged with samples of other common plant species (n = 3) and commercially available M. ilicifolia (n = 10) to evaluate its predictive capability. PCA successfully distinguished between the plant species. PLS-DA performed best on extract samples, with sensitivity, specificity and accuracy of 94 %, 100 % and 99 %, respectively. To better model the leaf and powder samples, machine learning models were developed using Random Forest with 10-fold cross-validation. The model yielded high accuracy for all sample types, with a low false positive rate and excellent performance across the metrics of accuracy, recall, precision, F1 score, Kappa index and Matthews Correlation Coefficient (MCC). The PCA and PLS-DA models were limited by the complexity of the leaf and powder samples. The machine learning algorithms showed robustness and flexibility, proving effective in the detection and discrimination of M. ilicifolia.
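The evaluation metrics named in the abstract can all be derived from a single one-vs-rest confusion matrix. The following is a minimal sketch, not the authors' code. The function name `binary_metrics` and the counts passed to it are hypothetical: the abstract does not report a confusion matrix, and the counts were chosen only so that the arithmetic lands on the same rounded sensitivity, specificity and accuracy (94 %, 100 %, 99 %) under the assumption of 100 target-species spectra among 600.

```python
from math import sqrt


def binary_metrics(tp, fn, fp, tn):
    """Compute common classification metrics from one-vs-rest confusion counts.

    Assumes every margin (tp+fn, tn+fp, tp+fp, tn+fn) is non-zero.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)

    # Matthews Correlation Coefficient.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts (NOT from the paper): 100 target spectra, 500 others.
metrics = binary_metrics(tp=94, fn=6, fp=0, tn=500)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy stays close to specificity because the non-target spectra outnumber the target spectra five to one, which is one reason Kappa and MCC are often reported alongside accuracy for imbalanced classes.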
Source journal
CiteScore: 7.50
Self-citation rate: 7.70%
Articles published: 169
Review time: 3.4 months
About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics: 1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.) 2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered. 3) Development of new software that provides novel tools or truly advances the use of chemometrical methods. 4) Well characterized data sets to test performance for the new methods and software. The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.