ESAM-CD: Fine-Tuned EfficientSAM Network With LoRA for Weakly Supervised Remote Sensing Image Change Detection

Impact factor: 8.6; CAS Zone 1 (Earth Sciences); JCR Q1 (Engineering, Electrical & Electronic)
Mengmeng Wang;Liang Zhou;Kaiyue Zhang;Xinghua Li;Ming Hao;Yuanxin Ye
DOI: 10.1109/TGRS.2024.3470808
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-16
Published: 2024-09-30
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10700770/
Citations: 0

Abstract

Change detection (CD) has become an attractive research topic in the field of remote sensing imagery in recent years. Despite significant advancements driven by deep learning (DL) techniques, most current methods rely on fully supervised strategies. These methods require a large number of pixel-level labels, which are time consuming and labor intensive to collect. To address this, we propose a weakly supervised CD method, termed EfficientSAM-CD (ESAM-CD), which leverages multiscale class activation map (CAM) fusion and a fine-tuned EfficientSAM image encoder. First, we construct a classification model trained on image-level labels with a deep supervision strategy to generate high-quality multiscale CAMs. Subsequently, a multiscale CAM fusion module is proposed to refine the boundaries of change targets by combining information from the different scales. Then, we use EfficientSAM, which has powerful generalization capabilities, as the backbone and fine-tune it with a low-rank adaptation (LoRA) strategy to establish a CD network. This network is trained on bitemporal images together with the generated pseudolabels. In addition, to overcome the reliance of EfficientSAM's decoder on prompts, we propose a prompt-free decoder based on general convolutional layers to predict change maps. Finally, we validate the effectiveness of the proposed ESAM-CD on two publicly available CD datasets (i.e., WHU-CD and LEVIR-CD). Comprehensive experiments demonstrate that our method outperforms other weakly supervised CD methods on both datasets.
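The abstract does not spell out how LoRA is applied to the EfficientSAM encoder, so the following is only a minimal, dependency-free sketch of the general LoRA idea: a frozen weight matrix W is left untouched and a trainable low-rank product B·A, scaled by alpha / rank, is added to its output. The class name `LoRALinear`, the helper functions, and the values of `rank` and `alpha` are illustrative assumptions, not details taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


class LoRALinear:
    """Frozen linear map W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration of generic LoRA; not the ESAM-CD implementation.
    """

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # pretrained weights, kept frozen
        # A starts as small random values, B as zeros, so the update is zero at first.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        """x has shape (batch, in_dim); the result has shape (batch, out_dim)."""
        base = matmul(x, transpose(self.weight))
        low = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return [[b + self.scale * l for b, l in zip(b_row, l_row)]
                for b_row, l_row in zip(base, low)]

    def num_trainable(self):
        """Only A and B would be updated during fine-tuning."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


weight = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # toy 2x3 "pretrained" matrix
layer = LoRALinear(weight, rank=2)
x = [[1.0, 2.0, 3.0]]
print(layer.forward(x))       # equals the frozen layer's output while B is all zeros
print(layer.num_trainable())  # rank * (in_dim + out_dim)
```

Because B is initialized to zero, the adapted layer reproduces the pretrained layer exactly before training, and only rank × (in_dim + out_dim) parameters per adapted matrix need gradients. That property is what makes LoRA attractive for adapting a large pretrained encoder with limited supervision.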
Source journal: IEEE Transactions on Geoscience and Remote Sensing (category: Engineering and Technology, Geochemistry and Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.