Machine learning driven portable Vis-SWNIR spectrophotometer for non-destructive classification of raw tomatoes based on lycopene content

IF 2.7 · CAS Zone 3 (Chemistry) · JCR Q2 (Chemistry, Analytical)
Arun Sharma, Ritesh Kumar, Nishant Kumar, Vikas Saxena
Vibrational Spectroscopy, vol. 130, Article 103628. Published 2023-12-02. DOI: 10.1016/j.vibspec.2023.103628
Full text: https://www.sciencedirect.com/science/article/pii/S0924203123001352
Citations: 0

Abstract


Machine learning driven portable Vis-SWNIR spectrophotometer for non-destructive classification of raw tomatoes based on lycopene content

Most research on intact fruit spectroscopy is derivative in nature, as it primarily showcases applications of existing spectroscopy devices, which are often proprietary. The regression models that researchers develop to predict physicochemical attributes from spectra remain theoretical because there is no mechanism for integrating them back into proprietary devices. This poses a challenge for the commercial adoption of the technology in the food quality supply chain. The present study addresses this research gap by presenting a first-of-its-kind approach to classifying tomatoes by lycopene content, using a portable short-wave near-infrared (SWNIR) spectrophotometer driven by a chemometrics-machine learning framework. The device integrates open-source hardware (an AS7265x multispectral chipset with a wavelength range of 410–940 nanometres (nm) and an Arduino Uno microcontroller) and software (the R platform), housed in an ergonomically designed, 3D-printed cabinet that ensures noise-free spectra acquisition. Lycopene content showed a strong negative correlation with the wavelengths 485, 560 and 585 nm (ρ = −0.65, −0.70 and −0.70, respectively) and a strong positive correlation with 760 nm (ρ = +0.64). Similar associations were observed qualitatively using principal component analysis. Unlike most of the literature, feature selection was based on analysis of variance: the 14 wavelengths that exhibited a statistically significant difference over a 15-day storage study (p ≤ 0.05) were selected for model development. The chemometrics-machine learning framework was used to develop optimised probabilistic and non-probabilistic models, including logistic regression, Linear Discriminant Analysis (LDA), Random Forest (RF), Artificial Neural Network (ANN) and Support Vector Machine (SVM) models, using 10-fold cross-validation and an 80–20% train-test split of the dataset. In agreement with the literature, the 500–750 nm wavelength range dominated the classification of lycopene content. Notably, specific wavelengths significantly influenced the outcomes of individual classifiers: logistic regression (560 nm), LDA (730 nm, 645 nm, 560 nm, 535 nm), RF (760 nm, 585 nm, 560 nm, 645 nm) and ANN (585 nm, 560 nm). Accuracy obtained from the confusion matrix on the test dataset was used as the performance metric for comparing models. Logistic regression and RF achieved an accuracy of 80%, LDA and SVM achieved 90%, and ANN outperformed all other models with an accuracy of 95%. This study advances spectroscopic technology for the non-invasive quality assessment of fruit. Similar studies on other climacteric fruits are recommended to support wider adoption of the technology.
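The evaluation protocol named in the abstract (an 80–20% train-test split, 10-fold cross-validation on the training portion, and accuracy read off a confusion matrix) can be sketched in a few lines. This is only an illustrative sketch, not the authors' R implementation: the sample count of 100, the two-class confusion matrix and all function names are hypothetical, since the abstract reports neither the dataset size nor the number of lycopene classes.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(train_indices, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i, validate in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate

def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified samples (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical sizes and counts, for illustration only (not the study's data).
train_idx, test_idx = train_test_split_indices(100)
folds = list(k_fold_indices(train_idx))
confusion = [[9, 1],
             [0, 10]]  # rows: true class, columns: predicted class
print(len(train_idx), len(test_idx), len(folds))
print(accuracy_from_confusion(confusion))
```

With these made-up counts, 19 of 20 test samples fall on the diagonal, so the accuracy is 0.95. The cross-validation folds are drawn only from the training indices, which keeps the held-out 20% untouched until the final comparison of models.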

Source journal: Vibrational Spectroscopy (Chemistry: Analytical Chemistry)
CiteScore: 4.70
Self-citation rate: 4.00%
Articles published: 103
Review time: 52 days
About the journal: Vibrational Spectroscopy provides a vehicle for the publication of original research that focuses on vibrational spectroscopy. This covers infrared, near-infrared and Raman spectroscopies and publishes papers dealing with developments in applications, theory, techniques and instrumentation. The topics covered by the journal include: Sampling techniques, Vibrational spectroscopy coupled with separation techniques, Instrumentation (Fourier transform, conventional and laser based), Data manipulation, Spectra-structure correlation and group frequencies. The application areas covered include: Analytical chemistry, Bio-organic and bio-inorganic chemistry, Organic chemistry, Inorganic chemistry, Catalysis, Environmental science, Industrial chemistry, Materials science, Physical chemistry, Polymer science, Process control, Specialized problem solving.