A nonlinear classification method for online sorting of navel orange SSC based on near-infrared spectroscopy and data augmentation
Authors: Shaohui Yu, Jing Liu
Journal: Postharvest Biology and Technology, Volume 232, Article 113990 (Impact Factor 6.8; JCR Q1, Agronomy)
Published: 2025-10-14
DOI: 10.1016/j.postharvbio.2025.113990
URL: https://www.sciencedirect.com/science/article/pii/S0925521425006027
As a non-invasive detection method, near-infrared (NIR) spectroscopy has shown significant potential for assessing fruit quality and for sorting. However, multiple factors affect accuracy during online fruit sorting. To address the challenges of limited sample size, heterogeneous quality, and the intricate nonlinear relationship between detection indices and spectral data in online sorting, this paper presents a data augmentation approach for the online sorting of navel oranges by soluble solids content (SSC) that integrates error-rate and probability-mass weighting. First, cluster analysis was performed on the SSC values, and the R² statistic and the elbow rule were used to determine the optimal number of clusters. The training and test sets were divided by Monte Carlo random sampling. Next, the training set was augmented with spectra that had undergone moving-average smoothing, forming an enhanced sample set. The probability mass and error rate of the training samples were then combined to formulate the sample weight coefficient. Finally, the augmented training set was used to build a classification model with a three-layer neural network. Results from multiple experiments showed that this method significantly improves classification performance on online sorting data, with classification accuracy exceeding 90%.
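The first three steps of the abstract (clustering SSC values, random train/test splitting, and moving-average augmentation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not name the clustering algorithm, the smoothing window, or the split ratio, so the k-means routine, the R² definition (one minus the within-cluster share of the total sum of squares), `window=5`, `test_fraction=0.3`, and all function names below are assumptions made for illustration. The sample-weighting scheme and the neural network are left out because the abstract gives no formula for them.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=100):
    """Lloyd's k-means on 1-D data (assumed clustering method); returns labels."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: spread the initial centers over the quantiles.
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the last set of centers.
    return np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)


def r_squared(values, labels):
    """Share of total variance explained by the clustering (assumed definition)."""
    values = np.asarray(values, dtype=float)
    total = np.sum((values - values.mean()) ** 2)
    within = sum(
        np.sum((values[labels == j] - values[labels == j].mean()) ** 2)
        for j in np.unique(labels)
    )
    return 1.0 - within / total


def monte_carlo_split(n_samples, test_fraction=0.3, seed=0):
    """One random train/test split; returns (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return perm[n_test:], perm[:n_test]


def moving_average_smooth(spectra, window=5):
    """Smooth each row (spectrum) with an odd-length moving average."""
    spectra = np.asarray(spectra, dtype=float)
    half = window // 2
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    kernel = np.ones(window) / window
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])


def augment_training_set(X_train, y_train, window=5):
    """Append smoothed copies of the training spectra, keeping their labels."""
    X_aug = np.vstack([X_train, moving_average_smooth(X_train, window)])
    y_aug = np.concatenate([y_train, y_train])
    return X_aug, y_aug


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    ssc = rng.normal(12.0, 1.5, size=60)     # synthetic SSC values, not real data
    spectra = rng.normal(size=(60, 200))     # synthetic spectra, not real data
    for k in range(2, 7):                    # inspect this curve for an elbow
        print(k, round(r_squared(ssc, kmeans_1d(ssc, k)), 3))
    classes = kmeans_1d(ssc, 3)
    train_idx, test_idx = monte_carlo_split(len(ssc))
    X_aug, y_aug = augment_training_set(spectra[train_idx], classes[train_idx])
    print(X_aug.shape, y_aug.shape)
```

Only the training spectra are augmented here, so the test set stays untouched by smoothing, which matches the order of steps given in the abstract.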
About the journal:
The journal is devoted exclusively to the publication of original papers, review articles and frontiers articles on biological and technological postharvest research. This includes the areas of postharvest storage, treatments and underpinning mechanisms, quality evaluation, packaging, handling and distribution of fresh horticultural crops including fruit, vegetables, flowers and nuts, but excluding grains, seeds and forages.
Papers reporting novel insights from fundamental and interdisciplinary research will be particularly encouraged. These disciplines include systems biology, bioinformatics, entomology, plant physiology, plant pathology, (bio)chemistry, engineering, modelling, and technologies for nondestructive testing.
Manuscripts on fresh food crops that will be further processed after postharvest storage, or on food processes beyond refrigeration, packaging and minimal processing will not be considered.