In the search for optimal multi-view learning models for crop classification with global remote sensing data

Impact factor: 8.6, JCR Q1 (Remote Sensing)

International journal of applied earth observation and geoinformation : ITC journal. Pub Date: 2025-09-01. DOI: 10.1016/j.jag.2025.104823

Francisco Mena, Diego Arenas, Andreas Dengel

Volume 143, Article 104823. Full text: https://www.sciencedirect.com/science/article/pii/S1569843225004704

Citations: 0

Abstract

Studying and analyzing cropland is difficult because of its dynamic and heterogeneous growth behavior. Diverse data sources can usually be collected to estimate it. Although deep learning models excel at crop classification, they face substantial challenges when dealing with multiple inputs, a setting known as multi-modal or Multi-View Learning (MVL). MVL methods can be characterized by their encoder architecture, fusion strategy, and optimization technique. The literature has primarily focused on specific encoder architectures for local regions, without a deeper exploration of the other components of the MVL methodology. In contrast, we investigate the simultaneous selection of the fusion strategy and encoder architecture, assessing global-scale cropland and crop-type classification. We consider five fusion strategies (Input, Feature, Decision, Ensemble, Hybrid) and five temporal encoders (LSTM, GRU, TempCNN, TAE, L-TAE) as possible configurations of the MVL method. We use the CropHarvest dataset for validation, which provides optical, radar, weather time series, and topographic information as input data. We found that, in scenarios with a limited number of labeled samples, no single configuration is sufficient for all cases. Instead, a specific combination of encoder and fusion strategy should be carefully sought. To streamline this search, we suggest identifying the optimal encoder architecture for a particular fusion strategy and then determining the most suitable fusion strategy for the classification task. We provide a standardized model schema for exploring crop classification through an MVL methodology.
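The suggested search order can be illustrated with a minimal Python sketch. It follows one possible reading of the abstract: fix a fusion strategy, pick the best encoder for it, then compare fusion strategies with that encoder. The function `two_stage_search`, its `start_fusion` parameter, and the `toy_score` evaluator are hypothetical names introduced here for illustration. The scores are made-up numbers, not results from the paper. In practice `evaluate` would train and validate a model on data such as CropHarvest.

```python
from typing import Callable, Sequence, Tuple

# Candidate components named in the abstract.
FUSIONS = ("Input", "Feature", "Decision", "Ensemble", "Hybrid")
ENCODERS = ("LSTM", "GRU", "TempCNN", "TAE", "L-TAE")


def two_stage_search(
    evaluate: Callable[[str, str], float],
    start_fusion: str = "Input",
    fusions: Sequence[str] = FUSIONS,
    encoders: Sequence[str] = ENCODERS,
) -> Tuple[str, str]:
    """Return (fusion, encoder) found by a sequential two-stage search.

    `evaluate(fusion, encoder)` is assumed to return a validation score
    where higher is better.
    """
    # Stage 1: keep the fusion strategy fixed and pick the best encoder.
    best_encoder = max(encoders, key=lambda enc: evaluate(start_fusion, enc))
    # Stage 2: keep that encoder fixed and pick the best fusion strategy.
    best_fusion = max(fusions, key=lambda fus: evaluate(fus, best_encoder))
    return best_fusion, best_encoder


def toy_score(fusion: str, encoder: str) -> float:
    """Hypothetical stand-in for training and validating one configuration.

    The numbers are invented for this example only.
    """
    encoder_base = {"LSTM": 0.70, "GRU": 0.72, "TempCNN": 0.71,
                    "TAE": 0.69, "L-TAE": 0.68}
    fusion_bonus = {"Input": 0.00, "Feature": 0.03, "Decision": 0.01,
                    "Ensemble": 0.02, "Hybrid": 0.015}
    return encoder_base[encoder] + fusion_bonus[fusion]


if __name__ == "__main__":
    fusion, encoder = two_stage_search(toy_score)
    print(f"selected fusion={fusion}, encoder={encoder}")
```

With five fusion strategies and five encoders, this sequential search calls `evaluate` 10 times, compared with 25 for an exhaustive grid. The trade-off is that it can miss a combination whose encoder only performs well under a fusion strategy other than `start_fusion`.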

Source journal

International journal of applied earth observation and geoinformation : ITC journal

Subject areas: Global and Planetary Change; Management, Monitoring, Policy and Law; Earth-Surface Processes; Computers in Earth Sciences

CiteScore

12.00

Self-citation rate

0.00%

Articles published

Review time

77 days

About the journal: The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.