Comparative study of indirect and direct feature extraction algorithms in classifying tea varieties using near-infrared spectroscopy.

IF 7; Zone 2, Agricultural and Forestry Sciences; Q1, FOOD SCIENCE & TECHNOLOGY

Current Research in Food Science. Pub Date: 2025-04-30. eCollection Date: 2025-01-01. DOI: 10.1016/j.crfs.2025.101065

Xuefan Zhou, Xiaohong Wu, Bin Wu

Current Research in Food Science, volume 10, article 101065. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099700/pdf/

Citations: 0

Abstract

Tea, a globally cherished beverage, has become an integral part of daily life, particularly in China. Because teas vary widely in price, flavor, and health benefits, reliable classification of tea varieties is crucial for the industry to meet diverse consumer preferences. This study applied indirect and direct feature extraction algorithms to the near-infrared (NIR) spectra of several tea varieties and compared their classification outcomes. Principal Component Analysis (PCA) served as the dimensionality reduction step for the indirect feature extraction algorithms. NIR spectra were first collected from different tea varieties and processed with three spectral preprocessing algorithms. Indirect and direct feature extraction algorithms were then used to reduce the dimensionality of the preprocessed data, and a K-Nearest Neighbors (KNN) classifier was applied to the reduced data to determine classification accuracy. The classification accuracies of the indirect feature extraction algorithms consistently exceeded those of the direct algorithms, with the former generally surpassing 90.0%, while the latter remained lower. This indicates that indirect feature extraction algorithms are more adept at handling complex spectral data. Classification accuracy declined significantly when the data were preprocessed with Savitzky-Golay (SG) filtering. Further analysis led to an optimization plan incorporating the Successive Projections Algorithm (SPA), which raised all classification accuracies above 90%.
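
The workflow described in the abstract (preprocessing, feature extraction, KNN classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectra are synthetic, Linear Discriminant Analysis (LDA) stands in for the feature extraction algorithms (which the abstract does not name), "indirect" is read here as PCA followed by the extractor and "direct" as the extractor applied to the full preprocessed spectra, and the window length, polynomial order, component count, and neighbor count are arbitrary choices.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


def make_synthetic_spectra(n_per_class=40, n_wavelengths=300, n_classes=4, seed=0):
    """Toy stand-in for NIR spectra: one Gaussian band per class plus noise."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_wavelengths)
    spectra, labels = [], []
    for c in range(n_classes):
        center = 0.2 + 0.2 * c  # class-specific band position (arbitrary)
        band = np.exp(-((axis - center) ** 2) / (2 * 0.05 ** 2))
        noise = rng.normal(scale=0.05, size=(n_per_class, n_wavelengths))
        spectra.append(band + noise)
        labels.append(np.full(n_per_class, c))
    return np.vstack(spectra), np.concatenate(labels)


def evaluate(X, y, extractor, seed=0):
    """Fit extractor + KNN on a training split and return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    model = make_pipeline(extractor, KNeighborsClassifier(n_neighbors=3))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Synthetic data only; replace with measured spectra (rows = samples).
X, y = make_synthetic_spectra()

# Preprocessing: Savitzky-Golay smoothing along the wavelength axis.
X_sg = savgol_filter(X, window_length=11, polyorder=2, axis=1)

# "Indirect" (assumed reading): PCA first, then the feature extractor.
indirect = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
# "Direct" (assumed reading): the extractor on the full preprocessed spectra.
direct = LinearDiscriminantAnalysis()

acc_indirect = evaluate(X_sg, y, indirect)
acc_direct = evaluate(X_sg, y, direct)
print(f"indirect (PCA + LDA) accuracy: {acc_indirect:.3f}")
print(f"direct (LDA only) accuracy:    {acc_direct:.3f}")
```

Accuracies on this toy data say nothing about the tea results reported above; the sketch only shows how the two kinds of pipeline differ structurally, so that either branch can be swapped for another preprocessing step or extractor.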

Source journal

Current Research in Food Science Agricultural and Biological Sciences-Food Science

CiteScore

7.40

Self-citation rate

3.20%

Articles published

232

Review time

84 days

About the journal: Current Research in Food Science is an international peer-reviewed journal dedicated to advancing knowledge in the field of food science. It publishes original research articles and short communications on a wide array of topics, including food chemistry, physics, microbiology, nutrition, nutraceuticals, process and package engineering, materials science, food sustainability, and food security. By covering these areas, the journal aims to provide a comprehensive source of the latest scientific findings and technological advancements shaping the future of the food industry. Its scope reflects the multidisciplinary nature of food science and a commitment to promoting innovation and ensuring the safety and quality of the food supply.