Optimizing soft sensor costs through feature selection: A comparative study of sensory and chemical parameters in wine grade prediction
Authors: Jingxian An, Zhipeng Zhang
Journal: Chemometrics and Intelligent Laboratory Systems, volume 262, Article 105404
Publication date: 2025-04-15
DOI: 10.1016/j.chemolab.2025.105404
URL: https://www.sciencedirect.com/science/article/pii/S0169743925000899
Impact factor: 3.7 (JCR Q2, Automation & Control Systems)
Citations: 0
Abstract
Traditional wine grade evaluation, typically conducted by world-renowned wine experts, was found to disadvantage emerging wineries due to its restrictive and time-consuming nature. This study proposed an alternative approach using soft sensors to predict wine grades, investigating the cost-effectiveness of both chemical and sensory evaluation methods through various machine learning approaches. A dataset of 23 unique wine samples in duplicate (totaling 46 bottles of New Zealand Pinot Noir wines), classified across all five stars of the Jukes-Stelzer system, was analyzed using 13 chemical parameters and 35 sensory attributes. The research employed classification algorithms, including naïve Bayes, k-nearest neighbors, decision trees, and support vector machines, to predict wine grades. Additionally, multiple feature selection methods—such as PCA distance analysis, ensemble tree-based feature selection, decision tree-based feature selection, Fisher score, relief-F score analysis, and one-way ANOVA—were used to identify the most significant predictive variables while minimizing analytical costs. Results demonstrated that chemical parameters, particularly those related to wine color and total phenolics, served as strong indicators of wine grade, with soft sensors using all 13 chemical parameters achieving prediction accuracies up to 93.48 %. Sensory attributes, particularly oak influence and tertiary aromas related to wine storage, also proved to be effective predictors. Soft sensors utilizing all 35 sensory attributes achieved accuracies of 97.83 %. Through feature selection methods, costs could be reduced by up to 100 % while maintaining acceptable prediction accuracy (above 65 %). Similarly, accuracies above 65 % were achieved using sensory attributes as input data, alongside a 97 % cost reduction. 
Additionally, in scenarios where chemical measurements were taken only once and sensory attributes were evaluated by a single wine expert, a comparative cost analysis revealed that chemical parameters were more cost-effective for moderate accuracy levels (<70 %), while sensory attributes were the more economical choice for high-accuracy predictions (>70 %), offering both high accuracy and cost-effectiveness. This study proposed a practical framework for cost-effective wine grade prediction methods that could benefit both established and emerging wine producers, offering an accessible alternative to traditional expert-based evaluation systems.
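The abstract names the Fisher score as one of the feature selection methods and reports cost reductions from measuring fewer variables. The following is a minimal sketch of that idea, not the authors' implementation: it ranks features by a common form of the Fisher score (between-class scatter divided by within-class scatter) and computes the percentage of measurement cost saved by keeping only the top-ranked features. The function names, the toy data, and the per-feature costs are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    n_features = len(X[0])
    # Group sample rows by class label (here: wine grade).
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    scores = []
    for j in range(n_features):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            col = [row[j] for row in rows]
            between += len(col) * (fmean(col) - overall_mean) ** 2
            within += len(col) * pvariance(col)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def cost_reduction(selected, costs):
    """Percentage of the full measurement cost saved by keeping only `selected`."""
    full = sum(costs)
    kept = sum(costs[j] for j in selected)
    return 100.0 * (full - kept) / full


# Toy data (hypothetical, not from the study): 6 samples, 3 features, 2 grades.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.5],
    [3.0, 5.5, 0.4],
    [3.1, 4.5, 0.8],
    [2.9, 5.0, 0.3],
]
y = [1, 1, 1, 2, 2, 2]
costs = [10.0, 30.0, 60.0]  # hypothetical per-feature measurement costs

scores = fisher_scores(X, y)
# Feature indices ordered from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
top_k = ranking[:1]  # keep only the single best-ranked feature
saving = cost_reduction(top_k, costs)
```

In this toy setup the first feature separates the two grades cleanly, so it ranks first, and keeping it alone avoids the cost of the other two measurements. A full soft sensor would then train a classifier (for example k-nearest neighbors) on the selected features and check whether accuracy stays above the chosen threshold.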
About the journal:
Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines.
Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data.
The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, Systems Biology, -Omics, etc.)
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.