t-SNE: A study on reducing the dimensionality of hyperspectral data for the regression problem of estimating oenological parameters

Impact factor: 8.2 (JCR Q1, Agriculture, Multidisciplinary)
Rui Silva, Pedro Melo-Pinto
Journal: Artificial Intelligence in Agriculture, volume 7, pages 58-68
Published: 2023-03-01
DOI: 10.1016/j.aiia.2023.02.003
URL: https://www.sciencedirect.com/science/article/pii/S2589721723000053
Citations: 1

Abstract

In recent years, machine learning techniques have become increasingly important for improving procedures in precision agriculture. In this work we study models capable of predicting oenological parameters from hyperspectral images of wine grape berries, a topic especially relevant to improving production tasks for winemakers. Specifically, we explore the capabilities of t-Distributed Stochastic Neighbor Embedding (t-SNE), a novel technique mostly used for visualization, for reducing the dimensionality of highly complex hyperspectral data, and compare its performance with Principal Component Analysis (PCA), which, despite the introduction of many nonlinear dimensionality reduction techniques over the years, has achieved the best results for real-world data across several studies in the literature. Additionally, we explore the potential of Kernel t-SNE, an extension of t-SNE that allows the technique to be used on streaming data or in online scenarios. Our results show that, in a direct comparison, t-SNE achieves better metrics than PCA for most of the data sets in this work, and that the regressor (Support Vector Regression, SVR) performs better with the t-SNE-reduced features as inputs, producing predictions with lower error rates. Compared with the current literature, our shallow learning model paired with t-SNE achieves results better than or on par with those reported, even competing with more advanced models that use deep learning techniques, which should encourage the use of t-SNE in more studies that require dimensionality reduction.
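
The comparison described in the abstract (reduce the spectra with PCA or t-SNE, then fit an SVR on the reduced features) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, synthetic data in place of the hyperspectral images, and hypothetical hyperparameters (two components, perplexity 30, an RBF kernel). Because scikit-learn's `TSNE` has no `transform` method for unseen samples, all samples are embedded together before the train/test split; that limitation is what out-of-sample extensions such as Kernel t-SNE are meant to address.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_spectra(n_samples=150, n_bands=60, seed=0):
    """Synthetic stand-in for hyperspectral spectra (NOT the paper's data)."""
    rng = np.random.default_rng(seed)
    latent = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    bands = np.linspace(0.0, 1.0, n_bands)
    # Each spectrum is a smooth nonlinear function of two latent factors.
    spectra = (np.sin(3.0 * np.outer(latent[:, 0], bands))
               + np.outer(latent[:, 1] ** 2, bands))
    spectra += 0.01 * rng.normal(size=spectra.shape)
    # Hypothetical target standing in for an oenological parameter.
    target = 2.0 * latent[:, 0] + latent[:, 1] ** 2
    return spectra, target


def rmse_after_reduction(features, target, seed=0):
    """Fit an SVR on already-reduced features and return the test RMSE."""
    x_tr, x_te, y_tr, y_te = train_test_split(
        features, target, test_size=0.3, random_state=seed)
    scaler = StandardScaler().fit(x_tr)
    model = SVR(kernel="rbf", C=10.0, epsilon=0.01)  # assumed settings
    model.fit(scaler.transform(x_tr), y_tr)
    pred = model.predict(scaler.transform(x_te))
    return float(np.sqrt(mean_squared_error(y_te, pred)))


spectra, target = make_synthetic_spectra()
n_components = 2  # assumed; the abstract does not state the dimensionality

# Both reducers are unsupervised and see every sample (no target is used),
# so the two feature sets are produced under the same conditions.
pca_features = PCA(n_components=n_components).fit_transform(spectra)
tsne_features = TSNE(n_components=n_components, perplexity=30.0,
                     init="pca", random_state=0).fit_transform(spectra)

results = {
    "PCA": rmse_after_reduction(pca_features, target),
    "t-SNE": rmse_after_reduction(tsne_features, target),
}
for name, rmse in results.items():
    print(f"{name} + SVR test RMSE: {rmse:.4f}")
```

Which reducer gives the lower error on this toy data says nothing about the paper's results; the sketch only shows the shape of the comparison.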

Source journal
Artificial Intelligence in Agriculture (Engineering: Engineering (miscellaneous))
CiteScore: 21.60
Self-citation rate: 0.00%
Articles published: 18
Review time: 12 weeks