Enhancing lettuce classification: Optimizing spectral wavelength selection via CCARS and PLS-DA

Impact factor: 6.3; JCR Q1 (Agricultural Engineering)
Nicola Dilillo, Andrea Sanna, Elena Belcore, Kyra Smith, Marco Piras, Bartolomeo Montrucchio, Renato Ferrero
Journal: Smart Agricultural Technology, volume 11, Article 100962
Publication date: 2025-04-24
DOI: 10.1016/j.atech.2025.100962
URL: https://www.sciencedirect.com/science/article/pii/S2772375525001959

Abstract

Spectroscopy is a valuable tool for analyzing the internal condition of plants. In this field, plant health is evaluated through light analysis, specifically by examining wavelengths beyond the visible spectrum, which makes it essential to select the most representative wavelengths. The Competitive Adaptive Reweighted Sampling (CARS) algorithm has been applied effectively in the literature to select the most informative variables in several applications, including agricultural monitoring, nutrient analysis, and chemometrics. This study presents the Calibrated CARS (CCARS) algorithm, an extension of CARS, used alongside the Partial Least Squares Discriminant Analysis (PLS-DA) model. The algorithm is designed to identify the most informative wavelengths in a spectral dataset of lettuce, enabling streamlined and efficient models for lettuce health classification. Although PLS-DA models are effective with spectral data, they tend to overfit; to address this problem, a rigorous, systematic evaluation approach is employed. Permutation tests verify the model's robustness, while learning curve analyses assess its capacity to generalize to unseen data. This comprehensive evaluation builds confidence in the robustness and stability of the PLS-DA models, which is achieved with the CCARS algorithm rather than the original CARS. The results demonstrate that using CCARS with 3 or 4 PLS components and only 30 or 19 selected wavelengths reduces the number of variables by 97% without sacrificing accuracy, while yielding a statistically significant, robust model.

