ESAM-CD: Fine-Tuned EfficientSAM Network With LoRA for Weakly Supervised Remote Sensing Image Change Detection
Mengmeng Wang; Liang Zhou; Kaiyue Zhang; Xinghua Li; Ming Hao; Yuanxin Ye
IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-16, published 2024-09-30 (impact factor 8.6; JCR Q1, Engineering, Electrical & Electronic)
DOI: 10.1109/TGRS.2024.3470808
https://ieeexplore.ieee.org/document/10700770/
Citations: 0
Abstract
Change detection (CD) has become an active research topic in remote sensing imagery in recent years. Despite significant advances driven by deep learning (DL), most current methods rely on fully supervised strategies. These methods require a large number of pixel-level labels, which are time-consuming and labor-intensive to collect. To address this, we propose a weakly supervised CD method, termed ESAM-CD, which combines multiscale class activation map (CAM) fusion with a fine-tuned EfficientSAM (ESAM) image encoder. First, we construct a classification model trained on image-level labels with a deep supervision strategy to generate high-quality multiscale CAMs. Subsequently, a multiscale CAM fusion module refines the boundaries of change targets by combining information from the different scales. Then, we use EfficientSAM, which has strong generalization capabilities, as the backbone and fine-tune it with a low-rank adaptation (LoRA) strategy to build a CD network. The bitemporal images and the generated pseudolabels are fed into this network. In addition, to overcome the reliance of the EfficientSAM decoder on prompts, we propose a prompt-free decoder built from general convolutional layers to predict change maps. Finally, we validate the proposed ESAM-CD on two publicly available CD datasets (WHU-CD and LEVIR-CD). Comprehensive experiments show that our method outperforms other weakly supervised CD methods on both datasets.
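The low-rank adaptation (LoRA) idea named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state which encoder layers are adapted, the rank, or the scaling, so the class name `LoRALinear` and the values of `rank` and `alpha` are hypothetical. The sketch only shows the general technique, where a pretrained weight `W` stays frozen and a trainable low-rank product `B @ A` is added to it.

```python
import numpy as np


class LoRALinear:
    """Frozen dense layer with a trainable low-rank update (generic LoRA sketch).

    Effective weight: W + (alpha / rank) * B @ A
    Only A and B would be updated during fine-tuning; W stays fixed.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                  # frozen, shape (out_dim, in_dim)
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))   # trainable down-projection
        self.B = np.zeros((out_dim, rank))                    # trainable, zero init: no change at start
        self.scale = alpha / rank                             # assumed scaling convention

    def forward(self, x):
        # x has shape (batch, in_dim); the low-rank path is computed without forming B @ A.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self):
        # After training, the update can be folded into a single dense weight.
        return self.weight + self.scale * self.B @ self.A

    def trainable_params(self):
        return self.A.size + self.B.size


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.normal(size=(64, 64))      # stand-in for one pretrained projection matrix
    layer = LoRALinear(W, rank=4)
    x = rng.normal(size=(2, 64))
    print(layer.forward(x).shape)                      # output shape matches the frozen layer
    print(layer.trainable_params(), "of", W.size)      # trainable vs. frozen parameter count
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and the number of trainable parameters grows with `rank * (in_dim + out_dim)` rather than `in_dim * out_dim`.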
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.