Simple methods for uncertainty estimation in neural networks applied to spectral data processing: A case study on mango dry matter prediction
Metz Maxime, Khadija Lamdibih, Jean-Michel Roger, David Esteve, Ryad Bendoula, Florent Abdelghafour
Chemometrics and Intelligent Laboratory Systems, volume 267, Article 105532
Published: 2025-09-16
DOI: 10.1016/j.chemolab.2025.105532
URL: https://www.sciencedirect.com/science/article/pii/S0169743925002175
Abstract
The growing complexity of real-world chemometric applications, particularly in spectroscopy, has exposed the limitations of traditional linear models in capturing non-linear patterns in spectral data. Deep learning models offer a powerful alternative but remain underutilised in chemometrics due to concerns about interpretability and trust, particularly in high-risk applications where uncertainty estimation is critical. This study investigates and compares three uncertainty estimation techniques suitable for neural networks: Monte Carlo Dropout (MC Dropout), model averaging, and Stochastic Weight Averaging-Gaussian (SWAG). These methods are evaluated using a spectral deep learning architecture. The analysis focuses on identifying key hyper-parameters affecting both predictive performance and uncertainty calibration. Results show that while MC Dropout offers a good balance between accuracy and uncertainty estimation at low computational cost, model averaging provides robust performance but at the expense of greater training time and storage. SWAG emerges as a middle-ground method requiring careful tuning. Importantly, a trade-off between predictive accuracy and uncertainty calibration is observed, underscoring the need to consider uncertainty as an integral part of model evaluation. These findings highlight the relevance of deep learning uncertainty estimation in chemometrics and open new directions for optimising data acquisition, model calibration, and model selection based on both prediction confidence and performance.
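As an illustration of the first technique named in the abstract, the following is a minimal sketch of MC Dropout for a regression output, written with NumPy only. It is not the authors' architecture or code: the one-hidden-layer network, the random untrained weights, the toy "spectra", and the names `mc_dropout_predict`, `p_drop`, and `n_passes` are all assumptions made for illustration. The idea shown is simply to keep dropout active at prediction time, repeat the forward pass several times, and read the spread of the predictions as an uncertainty estimate.

```python
import numpy as np


def mc_dropout_predict(x, w1, b1, w2, b2, p_drop=0.2, n_passes=100, seed=0):
    """Run n_passes stochastic forward passes with dropout kept active.

    Hypothetical helper for illustration only. Returns the per-sample
    predictive mean and standard deviation across the passes.
    """
    rng = np.random.default_rng(seed)
    preds = np.empty((n_passes, x.shape[0]))
    for t in range(n_passes):
        h = np.maximum(x @ w1 + b1, 0.0)       # hidden layer with ReLU
        mask = rng.random(h.shape) >= p_drop    # keep each unit with prob. 1 - p_drop
        h = h * mask / (1.0 - p_drop)           # inverted-dropout rescaling
        preds[t] = (h @ w2 + b2).ravel()        # one scalar prediction per sample
    return preds.mean(axis=0), preds.std(axis=0)


# Toy stand-in for spectra (assumption): 5 samples x 50 wavelengths,
# passed through random, untrained weights.
rng = np.random.default_rng(42)
x = rng.normal(size=(5, 50))
w1, b1 = rng.normal(scale=0.1, size=(50, 16)), np.zeros(16)
w2, b2 = rng.normal(scale=0.1, size=(16, 1)), np.zeros(1)

mean, std = mc_dropout_predict(x, w1, b1, w2, b2)
print("predictive mean:", mean)
print("predictive std: ", std)
```

Under the same assumptions, model averaging would replace the random masks with several independently trained weight sets and aggregate their predictions in the same way, which is consistent with the extra training time and storage the abstract attributes to it.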
About the journal:
Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines.
Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data.
The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.)
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.