Geographical Origin Identification of Red Chili Powder Using NIR Spectroscopy Combined with SIMCA and Machine Learning Algorithms
Authors: Deepoo Meena, Somsubhra Chakraborty, Jayeeta Mitra
Journal: Food Analytical Methods, vol. 17, no. 7, pp. 1005-1023 (impact factor 2.6; JCR Q2, Food Science & Technology)
Published: 2024-04-26
DOI: 10.1007/s12161-024-02625-6
URL: https://link.springer.com/article/10.1007/s12161-024-02625-6
Knowing the geographical origin of chili peppers produced in specific areas is crucial because the origin of a chili powder variety has a significant impact on its quality and price. In this research, NIR (near-infrared) spectroscopy was used for the first time to identify and classify the geographical origin of six varieties of chili powder. PCA (principal component analysis) was applied to extract relevant features from the spectral data and to reveal visible clustering trends. Supervised classification was then carried out with SIMCA (soft independent modeling of class analogy), a statistically based classification model, and with four machine learning (ML) classifiers: K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). The SVM classifier, with a C value of 4013.0 and γ of 0.04125, delivered the highest cross-validation accuracy of 98.41% and a prediction accuracy of 97.22%. The optimization process, guided by a detailed 3D contour plot, led to a model that generalized well and classified with high precision, as confirmed by confusion matrices. The classification accuracy of the SIMCA model was 94.04% for the calibration set and 84.74% for the prediction set. The nonlinear SVM outperformed the linear SIMCA model and the other ML models. Overall, the results indicate that NIR spectroscopy combined with the SVM model can discriminate chili powder from different geographical origins quickly, nondestructively, and reliably.
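The abstract reports a cross-validation accuracy, a prediction-set accuracy, and confusion matrices, but gives no code. The following is a minimal sketch of how those three quantities relate, using KNN (one of the four classifiers named) in plain Python. It is not the authors' pipeline: the data are synthetic stand-ins for spectra, and the origin labels, `k = 3`, the five folds, and the 75/25 calibration/prediction split are all assumptions chosen for illustration. The SVM, PCA, and SIMCA steps are not reproduced here.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true one."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def cross_val_accuracy(X, y, k=3, folds=5):
    """Mean accuracy over interleaved folds of the calibration set."""
    scores = []
    for f in range(folds):
        held_out = set(range(f, len(X), folds))
        tr = [i for i in range(len(X)) if i not in held_out]
        te = sorted(held_out)
        scores.append(accuracy([X[i] for i in tr], [y[i] for i in tr],
                               [X[i] for i in te], [y[i] for i in te], k))
    return sum(scores) / folds


# Synthetic stand-in for spectra: 3 hypothetical origins, 4 "wavelengths" each.
rng = random.Random(0)
centers = {"origin_A": 0.2, "origin_B": 0.5, "origin_C": 0.8}
samples = [([rng.gauss(c, 0.02) for _ in range(4)], label)
           for label, c in centers.items() for _ in range(20)]
rng.shuffle(samples)

# 75 % calibration set, 25 % prediction set (the split ratio is an assumption).
cut = int(0.75 * len(samples))
cal_X, cal_y = [s[0] for s in samples[:cut]], [s[1] for s in samples[:cut]]
pred_X, pred_y = [s[0] for s in samples[cut:]], [s[1] for s in samples[cut:]]

cv_acc = cross_val_accuracy(cal_X, cal_y)
pred_acc = accuracy(cal_X, cal_y, pred_X, pred_y)

# Confusion matrix as counts of (true label, predicted label) pairs.
cm = Counter((t, knn_predict(cal_X, cal_y, x)) for x, t in zip(pred_X, pred_y))

print(f"cross-validation accuracy: {cv_acc:.2%}")
print(f"prediction accuracy:       {pred_acc:.2%}")
for (true, pred), n in sorted(cm.items()):
    print(f"true={true} predicted={pred} count={n}")
```

Cross-validation accuracy is estimated only from the calibration samples, while prediction accuracy is measured on samples the model never saw. A gap between the two, such as the 94.04% versus 84.74% reported for SIMCA, is one sign that a model generalizes less well than its calibration performance suggests.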
About the journal:
Food Analytical Methods publishes original articles, review articles, and notes on novel and/or state-of-the-art analytical methods or issues to be solved, as well as significant improvements or interesting applications to existing methods. These include analytical technology and methodology for food microbial contaminants, food chemistry and toxicology, food quality, food authenticity and food traceability. The journal covers fundamental and specific aspects of the development, optimization, and practical implementation in routine laboratories, and validation of food analytical methods for the monitoring of food safety and quality.