Shuai Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang
Journal of Chemometrics, vol. 39, no. 1. Published 2025-01-08. Impact factor: 2.3. DOI: 10.1002/cem.3641
https://onlinelibrary.wiley.com/doi/10.1002/cem.3641
Citations: 0
Artificial and Algorithmic Screening of Infrared Spectral Feature Bands of Gastrodia elata to Achieve Rapid Identification of Its Species
Abstract
Gastrodia elata is a traditional Chinese medicine with both medicinal and edible value. In this paper, two kinds of datasets were acquired: partial spectra (manually selected peak-segment spectra) and full spectra (4000–400 cm−1). The competitive adaptive reweighted sampling (CARS) algorithm and the successive projections algorithm (SPA) were used to extract characteristic variables from both datasets, and partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), random forest (RF), and residual convolutional neural network (ResNet) models were established. Among the PLS-DA models, whole-MSC-CARS-PLS-DA (MSC: multiplicative scatter correction) was optimal, with a root mean square error of prediction (RMSEP) of 0.0658. Among the SVM models, Partial-SNV-SPA-SVM (SNV: standard normal variate) was the best, with a kernel parameter of 0.1768 and the lowest number of support vectors. Among the RF models, Partial-SNV-RF was optimal, but it was not as effective as the best PLS-DA and SVM models. The ResNet model built on the effective information had a loss value of 0.001, took little time to build, and used the original data directly. Therefore, the ResNet model based on feature bands is the most suitable of the models compared for practical application.
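The abstract names two spectral preprocessing steps (SNV and MSC) and one error metric (RMSEP) without defining them. The sketch below gives their standard textbook definitions in NumPy. It is illustrative only: the abstract does not state the paper's exact settings, so the function names, the use of the mean spectrum as the MSC reference, and the sample standard deviation (`ddof=1`) in SNV are assumptions rather than the authors' implementation.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)
    by its own mean and standard deviation.
    Assumption: sample standard deviation (ddof=1)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction: fit each spectrum to a reference
    by a straight line, then remove the fitted offset and slope.
    Assumption: the reference defaults to the mean spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # np.polyfit returns coefficients from highest degree down: [slope, intercept]
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


if __name__ == "__main__":
    # Hypothetical synthetic data: one band shape with per-sample gain and offset.
    wavenumbers = np.linspace(4000, 400, 200)
    band = np.exp(-((wavenumbers - 1650) / 120.0) ** 2) + 0.5
    raw = np.vstack([g * band + o for g, o in [(0.9, 0.05), (1.0, 0.0), (1.2, -0.1)]])
    print("SNV row means:", snv(raw).mean(axis=1))
    print("MSC max spread between samples:", np.ptp(msc(raw), axis=0).max())
```

In a pipeline such as the ones named in the abstract, one of these corrections would be applied to the spectra before variable selection (CARS or SPA) and model fitting, with RMSEP computed on the prediction set.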
About the journal:
The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.