Classification of Apricot Varieties by Infrared Spectroscopy and Machine Learning
Jaume Béjar-Grimalt, David Pérez-Guaita*, Ángel Sánchez-Illana*, Rodolfo García-Contreras, Rashmi Kataria, Sylvie Bureau, Miguel de la Guardia and Frédéric Cadet
ACS Agricultural Science & Technology, 5 (7), pp. 1373–1381. Published 2025-07-08.
DOI: 10.1021/acsagscitech.5c00068 (https://pubs.acs.org/doi/10.1021/acsagscitech.5c00068)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12309246/pdf/
Citations: 0
Abstract
This work investigated the use of ATR–FTIR spectroscopy combined with machine learning to classify eight apricot varieties. Traditionally, variety identification relies on measurements of physicochemical properties, which are time-consuming and require laboratory analysis. Instead, we used ATR–FTIR spectra from 731 apricots, divided into calibration (512) and test (219) sets, together with three machine learning models, namely partial least-squares discriminant analysis (PLS-DA), support vector machine (SVM), and random forest (RF), and correctly classified 97% of the test samples. Additionally, careful inspection of the PLS-DA regression vectors revealed a strong correlation between the spectra and the biochemical composition in sugars and organic acids, validating ATR–FTIR spectroscopy as a viable alternative for variety identification. Finally, to validate the results, additional reference models were constructed using the physicochemical data from the apricots. These reference models were evaluated on the same data splits as the spectroscopic models, and both approaches gave similar results.
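To make the sample counts above concrete, the following is a minimal sketch of a calibration/test split and an accuracy check, using only the Python standard library. It is not the authors' procedure: the abstract does not say how the 731 apricots were assigned to the two sets, so the purely random split, the seed, and the helper names `split_indices` and `accuracy` are all assumptions made for illustration. The last lines work out which numbers of correct test predictions would be reported as 97%, assuming the figure is rounded to the nearest whole percent.

```python
import random

# Sample counts reported in the abstract.
N_TOTAL, N_CAL, N_TEST = 731, 512, 219


def split_indices(n_total, n_cal, seed=0):
    """Randomly partition range(n_total) into calibration and test indices.

    Assumption: a plain random split. The abstract does not state how the
    split was made (for example, whether it was stratified by variety).
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    return sorted(indices[:n_cal]), sorted(indices[n_cal:])


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted variety matches the true one."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


cal_idx, test_idx = split_indices(N_TOTAL, N_CAL)
print(len(cal_idx), len(test_idx))   # 512 219
print(f"{N_CAL / N_TOTAL:.1%}")      # 70.0%

# Numbers of correct test predictions that round to 97% of 219 samples.
matches = [k for k in range(N_TEST + 1) if round(100 * k / N_TEST) == 97]
print(matches)                       # [212, 213]
```

Under these assumptions the split is close to 70/30, and a 97% test accuracy corresponds to 212 or 213 of the 219 test apricots being assigned to the correct variety.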