Identification of geographical origins of Gastrodia elata Blume based on multisource data fusion.

IF 3.0, Zone 3 (Biology), JCR Q2 (Biochemical Research Methods)
Phytochemical Analysis Pub Date : 2024-10-01 Epub Date: 2024-06-27 DOI:10.1002/pca.3413
Hong Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang
Citation count: 0

Abstract

Introduction: Identifying the geographical origin of Gastrodia elata Blume contributes to the scientific and rational utilization of medicinal materials. In this study, infrared spectroscopy was combined with machine learning algorithms to distinguish the origin of G. elata Bl.

Objective: To achieve rapid and accurate identification of the origin of G. elata Bl.

Materials and methods: Attenuated total reflection Fourier transform infrared (ATR-FTIR) spectra and Fourier transform near-infrared (FT-NIR) spectra were collected for 306 samples of G. elata Bl.

Samples: First, support vector machine (SVM) models were established on the single-spectrum data and on the full-spectrum fusion data. To investigate whether a feature-level fusion strategy can enhance model performance, a sequential and orthogonalized partial least squares discriminant analysis (SO-PLS-DA) model was established to extract and combine the two types of spectral features. Next, six algorithms were employed to extract feature variables, and an SVM model was established on the feature-level fusion data. To avoid complicated preprocessing and feature extraction, a residual convolutional neural network (ResNet) model was established after converting the raw spectral data into spectral images.
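The difference between full-spectrum (low-level) fusion and feature-level fusion can be illustrated with a minimal sketch. This is not the authors' pipeline: the autoscaling step, the Fisher-score variable ranking (a stand-in for the six feature-extraction algorithms, which the abstract does not name), all function names, and the toy numbers are assumptions made purely for illustration. The classifier step (for example an SVM) would then be trained on whichever fused matrix is produced and is omitted here.

```python
from statistics import mean, pstdev

def autoscale(block):
    """Mean-centre each column and scale it to unit variance (assumed preprocessing)."""
    cols = list(zip(*block))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns are left unscaled
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in block]

def low_level_fusion(block_a, block_b):
    """Full-spectrum fusion: concatenate the two autoscaled blocks per sample."""
    a, b = autoscale(block_a), autoscale(block_b)
    return [ra + rb for ra, rb in zip(a, b)]

def fisher_scores(block, labels):
    """Ratio of between-class to within-class scatter for every variable."""
    classes = sorted(set(labels))
    scores = []
    for col in zip(*block):
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [x for x, y in zip(col, labels) if y == c]
            mu = mean(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((x - mu) ** 2 for x in vals)
        scores.append(between / (within + 1e-12))  # small constant avoids division by zero
    return scores

def select_top_k(block, labels, k):
    """Keep the k variables with the highest Fisher score (hypothetical selector)."""
    scores = fisher_scores(block, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in block], keep

def feature_level_fusion(block_a, block_b, labels, k_a, k_b):
    """Feature-level fusion: select variables per block, then concatenate."""
    fa, _ = select_top_k(autoscale(block_a), labels, k_a)
    fb, _ = select_top_k(autoscale(block_b), labels, k_b)
    return [ra + rb for ra, rb in zip(fa, fb)]

# Toy data: 4 samples from 2 origins, 3 "ATR-FTIR" and 2 "FT-NIR" variables.
labels = ["A", "A", "B", "B"]
ftir = [[1.0, 5.0, 0.2], [1.1, 5.2, 0.9], [3.0, 5.1, 0.3], [3.2, 4.9, 0.8]]
nir = [[0.5, 2.0], [0.4, 2.1], [0.6, 4.0], [0.5, 4.2]]

full = low_level_fusion(ftir, nir)                     # 5 variables per sample
fused = feature_level_fusion(ftir, nir, labels, 1, 1)  # 2 variables per sample
```

In this toy case the full-spectrum matrix keeps all five variables, whereas the feature-level matrix keeps only the one variable per block that best separates the two origins, which is the kind of dimensionality reduction feature-level fusion is intended to provide.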

Results: The feature-level fusion model is more accurate than both the single-spectrum model and the full-spectrum fusion model, and SO-PLS-DA is simpler than feature-level fusion based on the SVM model. The ResNet model classifies well but requires more data to improve its generalization capability and training effectiveness.

Conclusion: Sequential and orthogonalized data fusion approaches and ResNet models are powerful solutions for identifying the geographical origin of G. elata Bl.

Source journal: Phytochemical Analysis (Biology, Analytical Chemistry)
CiteScore: 6.00
Self-citation rate: 6.10%
Articles published: 88
Review time: 1.7 months
Journal description: Phytochemical Analysis is devoted to the publication of original articles concerning the development, improvement, validation and/or extension of application of analytical methodology in the plant sciences. The spectrum of coverage is broad, encompassing methods and techniques relevant to the detection (including bio-screening), extraction, separation, purification, identification and quantification of compounds in plant biochemistry, plant cellular and molecular biology, plant biotechnology, the food sciences, agriculture and horticulture. The Journal publishes papers describing significant novelty in the analysis of whole plants (including algae), plant cells, tissues and organs, plant-derived extracts and plant products (including those which have been partially or completely refined for use in the food, agrochemical, pharmaceutical and related industries). All forms of physical, chemical, biochemical, spectroscopic, radiometric, electrometric, chromatographic, metabolomic and chemometric investigations of plant products (monomeric species as well as polymeric molecules such as nucleic acids, proteins, lipids and carbohydrates) are included within the remit of the Journal. Papers dealing with novel methods relating to areas such as data handling/data mining in plant sciences will also be welcomed.