Mengmeng Wang;Liang Zhou;Kaiyue Zhang;Xinghua Li;Ming Hao;Yuanxin Ye
IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-16. Published 2024-09-30. DOI: 10.1109/TGRS.2024.3470808. URL: https://ieeexplore.ieee.org/document/10700770/. Impact factor: 8.6; JCR Q1 (Engineering, Electrical & Electronic). Publication type: Journal Article.
Citations: 0
ESAM-CD: Fine-Tuned EfficientSAM Network With LoRA for Weakly Supervised Remote Sensing Image Change Detection
Change detection (CD) has become an attractive research topic in remote sensing imagery in recent years. Despite significant advances driven by deep learning (DL) techniques, most current methods rely on fully supervised strategies. These methods require collecting a large number of pixel-level labels, which is time consuming and labor intensive. To address this, we propose a weakly supervised CD method built on EfficientSAM, termed ESAM-CD, which leverages multiscale class activation map (CAM) fusion and a fine-tuned EfficientSAM image encoder. First, we construct a classification model that uses image-level labels with a deep supervision strategy to generate high-quality multiscale CAMs. Subsequently, a multiscale CAM fusion module is proposed to refine the boundaries of change targets by harnessing information from various scales. Then, we use EfficientSAM, which has powerful generalization capabilities, as the backbone and fine-tune it with a low-rank adaptation (LoRA) strategy to establish a CD network; the bitemporal images and the generated pseudolabels are fed into this network. In addition, to overcome the reliance of EfficientSAM’s decoder on prompts, we propose a prompt-free decoder based on general convolutional layers to predict change maps. Finally, we validate the effectiveness of the proposed ESAM-CD on two publicly available CD datasets (i.e., WHU-CD and LEVIR-CD). Comprehensive experiments demonstrate that our method outperforms other weakly supervised CD methods on both datasets.
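The LoRA fine-tuning step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which EfficientSAM layers receive adapters, nor the rank or scaling used. The class name `LoRALinear`, the `rank` and `alpha` parameters, and the zero initialization of `B` are assumptions that follow the common LoRA convention, in which a frozen weight W is augmented by a trainable low-rank update (alpha / r) * B * A. Plain Python lists are used so the example has no dependencies.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Hypothetical linear layer: frozen W (out x in) plus (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight            # pretrained weight, kept frozen
        self.scale = alpha / rank       # assumed scaling convention
        # A starts with small random values and B with zeros, so the
        # low-rank update is exactly zero before any training happens.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + scale * (B @ A) without modifying the frozen W."""
        delta = matmul(self.B, self.A)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, batch):
        """Apply the adapted layer to a list of input vectors."""
        w = self.effective_weight()
        return [[sum(wi * xi for wi, xi in zip(w_row, vec)) for w_row in w]
                for vec in batch]

    def trainable_params(self):
        """Count only the adapter parameters (entries of A and B)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    layer = LoRALinear(identity, rank=1)
    print(layer.forward([[1.0, 2.0, 3.0, 4.0]]))
    print(layer.trainable_params(), "trainable vs", 4 * 4, "frozen")
```

Only `A` and `B` would be updated during fine-tuning, which is why the approach is cheap relative to updating the full backbone: for a 4 x 4 weight and rank 1, the adapter holds 8 parameters against 16 frozen ones, and the gap widens quickly at realistic layer sizes.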
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.