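Feature extraction of fluorescence excitation-emission matrices using PCA fused with Wilks Λ-statistic and FDA for origin identification and active components content prediction of sweet basil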

IF 2.9 | Zone 3, Agricultural and Forestry Sciences | JCR Q2, Food Science & Technology
Wenfei Du, Yong Yin, Hao Wu, Yunxia Yuan, Junliang Chen, Yunfeng Xu, Huichun Yu
Journal of Food Measurement and Characterization, 18(12), pp. 9971-9982. Journal article, published 2024-11-01. DOI: 10.1007/s11694-024-02935-7. https://link.springer.com/article/10.1007/s11694-024-02935-7
Citations: 0

Abstract


Feature extraction of fluorescence excitation-emission matrices using PCA fused with Wilks Λ-statistic and FDA for origin identification and active components content prediction of sweet basil


Sweet basil is a commonly used food spice and traditional medicine in China, and geographical differences have a significant impact on the content of its active ingredients. In this study, a feature extraction strategy for fluorescence data, using principal component analysis (PCA) fused with the Wilks Λ-statistic and Fisher discriminant analysis (FDA), was proposed for rapid discrimination and quantitative detection of sweet basil from different origins. After pretreatment of the fluorescence excitation-emission matrices, 8 feature emission wavelengths were extracted using PCA combined with the Wilks Λ-statistic. The fluorescence excitation-emission matrices corresponding to these feature emission wavelengths were then fused by FDA, and the first three FD variables, with a cumulative discriminant power of 99%, were selected as feature vectors. Finally, extreme learning machine (ELM) and random forest (RF) models were constructed for sweet basil origin identification, and the back propagation neural network (BPNN) algorithm was employed for rapid prediction of linalool and flavonoids in sweet basil. The results showed that, compared with the RF model, the ELM model was more suitable for identifying sweet basil from different sources, with an accuracy rate of 98%. The coefficient of determination (R²) and root mean square error (RMSE) of the BPNN-based linalool content prediction model were 0.984 and 0.131, respectively. The R² and RMSE of the BPNN-based flavonoid content prediction model were 0.969 and 0.019, respectively. These results indicate that the proposed feature extraction method shows good generalization ability and robustness, and that it provides an alternative feature selection method for rapid identification of food sources and evaluation of food quality.
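The abstract does not spell out how the Wilks Λ-statistic is combined with PCA, or how "cumulative discriminant power" is computed. The sketch below is therefore only an illustration under stated assumptions, not the authors' procedure: it ranks candidate emission wavelengths by the univariate Wilks Λ (within-class sum of squares divided by total sum of squares, where smaller values mean better class separation), and it treats discriminant power as each discriminant's share of the summed eigenvalues. The wavelength names, intensities, and eigenvalues are made up for the example.

```python
from statistics import fmean


def wilks_lambda(values, labels):
    """Univariate Wilks' Lambda: within-class SS divided by total SS."""
    grand_mean = fmean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0.0:
        return 1.0  # a constant feature carries no class information
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    within_ss = 0.0
    for members in groups.values():
        class_mean = fmean(members)
        within_ss += sum((v - class_mean) ** 2 for v in members)
    return within_ss / total_ss


def rank_features(columns, labels, keep):
    """Return the `keep` feature names with the smallest Wilks' Lambda."""
    scored = sorted((wilks_lambda(col, labels), name) for name, col in columns.items())
    return [name for _, name in scored[:keep]]


def n_discriminants(eigenvalues, threshold=0.99):
    """Smallest number of leading discriminants whose eigenvalue share
    reaches `threshold` (assumed meaning of cumulative discriminant power)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, eigenvalue in enumerate(ordered, start=1):
        running += eigenvalue
        if running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical toy data: two origins, two candidate emission wavelengths.
labels = ["A", "A", "A", "B", "B", "B"]
columns = {
    "em_450nm": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],  # differs strongly by origin
    "em_520nm": [2.0, 3.0, 1.0, 2.0, 3.0, 1.0],  # identical across origins
}

best = rank_features(columns, labels, keep=1)
print("selected wavelength:", best)
print("discriminants kept:", n_discriminants([8.0, 1.5, 0.45, 0.05]))
```

In a full pipeline the selected features would feed a classifier or regressor such as the ELM, RF, or BPNN models named in the abstract; those steps are left out here because the abstract gives no architecture or hyperparameter details.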

Source journal: Journal of Food Measurement and Characterization (Agricultural and Biological Sciences, Food Science)
CiteScore: 6.00
Self-citation rate: 11.80%
Articles published: 425
About the journal: This interdisciplinary journal publishes new measurement results, characteristic properties, differentiating patterns, measurement methods and procedures for such purposes as food process innovation, product development, quality control, and safety assurance. The journal encompasses all topics related to food property measurement and characterization, including all types of measured properties of food and food materials, features and patterns, measurement principles and techniques, development and evaluation of technologies, novel uses and applications, and industrial implementation of systems and procedures.