A cost-effective and robust mapping method for diverse crop types using weakly supervised semantic segmentation with sparse point samples

Zhiwen Cai, Baodong Xu, Qiangyi Yu, Xinyu Zhang, Jingya Yang, Haodong Wei, Shiqi Li, Qian Song, Hang Xiong, Hao Wu, Wenbin Wu, Zhihua Shi, Qiong Hu

ISPRS Journal of Photogrammetry and Remote Sensing, Volume 218, Pages 260-276. Published 2024-09-20. DOI: 10.1016/j.isprsjprs.2024.09.017
Citations: 0
Abstract
Accurate and timely information on the spatial distribution and area of crop types is critical for yield estimation, agricultural management, and sustainable development. However, traditional crop classification methods often struggle to identify diverse crop types effectively, owing to the crops' intricate spatiotemporal patterns and the methods' high training-data demands. To address this challenge, we developed a Structure-aware Label eXpansion segmentation Network (StructLabX-Net) for diverse crop type mapping using limited point-annotated samples. StructLabX-Net features a backbone, U-TempoNet, which combines CNNs and an LSTM to capture intricate spatiotemporal patterns. It also incorporates multi-task weak-supervision heads for edge detection and pseudo-label expansion, adding crucial structural and contextual cues. We tested StructLabX-Net across three distinct regions in China, assessing over 10 crop types and comparing its performance against five popular classifiers on multi-temporal Sentinel-2 images. The results showed that StructLabX-Net significantly outperformed RF, SVM, DeepCropMapping, Transformer, and a patch-based CNN in identifying crop types across all three regions with sparse training samples. It achieved the highest overall accuracy and mean F1-score: 91.0% and 89.1% in the Jianghan Plain, 91.5% and 90.7% in the Songnen Plain, and 91.0% and 90.8% in the Sanjiang Plain. StructLabX-Net showed a particular advantage for "hard" types characterized by limited samples and complex phenological features. Furthermore, ablation experiments highlight the crucial role of the "edge" head in guiding the model to differentiate crop types with clearer class boundaries, and of the "expansion" head in refining the model's understanding of target crops by adding detail to the pseudo-labels.
Meanwhile, combining our U-TempoNet backbone with the multi-task weak-supervision heads produced better crop type mapping results than other segmentation models. Overall, StructLabX-Net maximizes the utilization of limited sparse samples from field surveys, offering a simple, cost-effective, and robust solution for accurately mapping diverse crop types at large scales. The code will be publicly available at https://github.com/BruceKai/StructLabX-Net.
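The abstract describes the "expansion" head only at a high level. As a purely illustrative sketch (not the authors' implementation), the core idea of growing sparse point annotations into denser pseudo-labels can be approximated by propagating each labeled point to neighboring pixels whose per-pixel features are sufficiently similar. The function name, the cosine-similarity criterion, and the threshold below are all assumptions for illustration:

```python
import numpy as np

def expand_point_labels(features, point_labels, sim_thresh=0.9):
    """Expand sparse point labels to similar neighboring pixels.

    features: (H, W, C) per-pixel feature vectors (e.g. temporal spectra).
    point_labels: (H, W) int array; -1 = unlabeled, >= 0 = class id.
    Returns a new (H, W) pseudo-label map.
    """
    H, W, _ = features.shape
    pseudo = point_labels.copy()
    # Normalize features so the dot product is cosine similarity.
    norm = np.linalg.norm(features, axis=-1, keepdims=True)
    unit = features / np.maximum(norm, 1e-8)
    ys, xs = np.where(point_labels >= 0)
    for y, x in zip(ys, xs):
        cls = point_labels[y, x]
        # Visit the 8-neighborhood of each annotated point.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and pseudo[ny, nx] < 0:
                    sim = float(unit[y, x] @ unit[ny, nx])
                    if sim >= sim_thresh:
                        pseudo[ny, nx] = cls
    return pseudo
```

In the paper's actual pipeline the expanded pseudo-labels supervise a segmentation head rather than being used directly as final labels; a real implementation would also iterate the expansion and gate it on model confidence.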
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) is the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It serves as a platform for scientists and professionals worldwide working in disciplines that use photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate the communication and dissemination of advances in these disciplines, while also serving as a comprehensive source of reference and archive.
P&RS publishes high-quality, peer-reviewed research papers that are preferably original and previously unpublished. These papers may address scientific/research, technological-development, or application/practical aspects. The journal also welcomes papers based on presentations at ISPRS meetings, provided they constitute significant contributions to the aforementioned fields.
In particular, P&RS encourages submissions that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new scientific or professional directions. Theoretical papers should preferably include practical applications, while papers focusing on systems and applications should include a theoretical background.