Comparison of machine learning models for classifying edible oils using Fourier-transform infrared spectroscopy

IF 1.7, Zone 4, Chemistry
Hyeona Lim, Seon Yeong Lee, Jin Young Kim, Yeon Ju Shin, Yerin Jang, Hyeonjin Kim, Byung Hee Kim, Sangdoo Ahn
Bulletin of the Korean Chemical Society, vol. 46, no. 2, pp. 131-137. Published 2024-12-23. DOI: 10.1002/bkcs.12932
https://onlinelibrary.wiley.com/doi/10.1002/bkcs.12932
Citations: 0

Abstract

Accurate classification and authentication of edible oils are essential for maintaining product quality, ensuring consumer safety, and preserving market integrity. This study proposes Fourier-transform infrared (FT-IR) spectroscopy, combined with machine learning models, as a rapid and non-destructive technique for classifying edible oils. The FT-IR spectra of seven edible oil types were analyzed across three spectral regions: the full range, the C-H stretching range, and the fingerprint region. Both absorbance and second derivative spectra were used to evaluate the influence of spectral preprocessing on classification accuracy. Six machine learning models were employed to classify the oils: principal component analysis followed by linear discriminant analysis (PCA-LDA), k-nearest neighbors, decision tree, random forest, eXtreme Gradient Boosting, and support vector machines (SVM). They achieved training accuracies of 96.4%–100% and testing accuracies of 88.1%–100%. The second derivative spectra enhanced model performance by improving the resolution of overlapping peaks, particularly in the C-H and C-O stretching regions. SHapley Additive exPlanations analysis revealed the spectral features that most influenced model predictions, offering insight into the models' decision-making processes. Overall, the study demonstrates the effectiveness of combining FT-IR spectroscopy, second derivative preprocessing, and machine learning for classifying edible oils, and highlights both the benefit of second derivative spectra in enhancing spectral resolution and the superior classification performance of the PCA-LDA and SVM models. These results offer a framework for advancing edible oil analysis and underline the potential of artificial intelligence in food authentication and quality control.
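The workflow summarized in the abstract (second derivative preprocessing followed by a classifier such as PCA-LDA or SVM) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the spectra are synthetic, the three "oil types", band positions, Savitzky-Golay settings (window_length=15, polyorder=3), number of principal components, and SVM parameters are all made up for the example, and SciPy and scikit-learn are assumed to be available. The abstract does not say how the second derivative was computed; Savitzky-Golay differentiation is a common choice and is used here as a stand-in.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical C-H stretching window in cm^-1, sampled at 1 cm^-1 steps.
wavenumbers = np.linspace(2800.0, 3050.0, 251)


def synthetic_spectrum(shoulder_height):
    """Two overlapping Gaussian bands plus a random sloping baseline and noise."""
    main = np.exp(-0.5 * ((wavenumbers - 2925.0) / 12.0) ** 2)
    shoulder = shoulder_height * np.exp(-0.5 * ((wavenumbers - 2955.0) / 8.0) ** 2)
    baseline = rng.uniform(0.0, 0.3) + rng.uniform(0.0, 0.002) * (wavenumbers - 2800.0)
    noise = rng.normal(0.0, 0.002, wavenumbers.size)
    return main + shoulder + baseline + noise


# Three made-up classes that differ only in the height of the shoulder band.
heights = [0.20, 0.30, 0.40]
X = np.array([synthetic_spectrum(h) for h in heights for _ in range(40)])
y = np.repeat(np.arange(len(heights)), 40)

# Second derivative along the wavenumber axis (smoothing + differentiation).
X_d2 = savgol_filter(X, window_length=15, polyorder=3, deriv=2, axis=1)


def evaluate(features):
    """Fit PCA-LDA and SVM on a stratified split and return test accuracies."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, y, test_size=0.3, stratify=y, random_state=0
    )
    models = {
        "PCA-LDA": make_pipeline(
            StandardScaler(), PCA(n_components=5), LinearDiscriminantAnalysis()
        ),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0)),
    }
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}


results = {"absorbance": evaluate(X), "second derivative": evaluate(X_d2)}
for prep, scores in results.items():
    for name, acc in scores.items():
        print(f"{prep:>17} | {name:<7} | test accuracy = {acc:.3f}")
```

One reason second derivative spectra can help is visible in the construction above: a constant offset or linear baseline drift has zero second derivative, so sample-to-sample baseline variation drops out, while narrow overlapping bands become sharper features. The accuracies printed by this sketch depend entirely on the synthetic data and say nothing about the values reported in the paper.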

Source journal
Bulletin of the Korean Chemical Society
Subject area: Chemistry - General Chemistry
Self-citation rate: 23.50%
Articles published: 182
About the journal: The Bulletin of the Korean Chemical Society is an official research journal of the Korean Chemical Society. It was founded in 1980 and reaches out to the chemical community worldwide. It is strictly peer-reviewed and welcomes Accounts, Communications, Articles, and Notes written in English. The scope of the journal covers all major areas of chemistry: analytical chemistry, electrochemistry, industrial chemistry, inorganic chemistry, life-science chemistry, macromolecular chemistry, organic synthesis, non-synthetic organic chemistry, physical chemistry, and materials chemistry.