Cross-Temporal Remote Sensing Image Change Captioning: A Manifold Mapping and Bayesian Diffusion Approach for Land Use Monitoring
Authors: Qingshan Bai; Xiaohua Wang
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 14406-14415
Publication date: 2025-06-02
DOI: 10.1109/JSTARS.2025.3575807
URL: https://ieeexplore.ieee.org/document/11021286/
Journal metrics: impact factor 4.7; JCR Q1 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
This study proposes CTM, a cross-temporal remote sensing image change captioning (RSICC) model built on manifold mapping and Bayesian diffusion. The primary objective of CTM is to improve the accuracy and robustness of change captioning for multitemporal remote sensing images (RSIs). The model first employs manifold mapping to model illumination variations, reducing the impact of seasonal and lighting factors on image consistency. Bayesian diffusion is then introduced to better model cross-temporal image changes, improving robustness against noise and pseudo-changes. In addition, a dual-layer multicoding module strengthens temporal feature representation, improving the perception of change regions. Finally, an image-text captioning strategy based on difference enhancement and dual attention is proposed to optimize feature selection and improve the accuracy and detail of the textual descriptions. Experimental results demonstrate that CTM is more robust on long-span RSIs, effectively mitigating pseudo-changes caused by illumination and seasonal variations. On the LEVIR-CC dataset, CTM achieves a CIDEr score of 138.78, outperforming the best existing method by 7.38 points. On the WHU-CDC dataset, CTM achieves the highest BLEU and METEOR scores and a CIDEr score of 153.29. Furthermore, visual analysis indicates that CTM accurately captures real change regions while significantly suppressing pseudo-changes, maintaining high descriptive accuracy even in complex environments. This study provides an efficient and precise solution for applications such as land use monitoring, environmental monitoring, and disaster response.
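The abstract does not give implementation details, so the following is only a toy sketch of the underlying problem, not the CTM method. It assumes a deliberately simple stand-in for illumination handling (per-image standardization to zero mean and unit variance, which is not manifold mapping) and shows how a pure lighting change can look like a large change in raw pixel differences yet vanish after normalization, while a genuine change remains visible. The function names and pixel values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(pixels):
    """Rescale a flat list of intensities to zero mean and unit variance."""
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [0.0 for _ in pixels]
    return [(p - mu) / sigma for p in pixels]


def mean_abs_difference(before, after):
    """Mean absolute per-pixel difference between two equally sized images."""
    if len(before) != len(after):
        raise ValueError("images must have the same number of pixels")
    return mean([abs(a - b) for a, b in zip(before, after)])


# Hypothetical toy data: a scene, the same scene under brighter light
# (gain 1.5, offset 20), and the relit scene with one truly changed pixel.
scene = [10.0, 40.0, 40.0, 90.0, 120.0, 60.0]
relit = [1.5 * p + 20.0 for p in scene]
changed = list(relit)
changed[2] = 200.0  # stand-in for a real change, e.g. a new bright rooftop

# Raw differencing reports a large "change" for a lighting shift alone.
raw_pseudo = mean_abs_difference(scene, relit)
# After standardization the lighting-only pair is essentially identical.
norm_pseudo = mean_abs_difference(standardize(scene), standardize(relit))
# The genuinely changed image still differs after standardization.
norm_real = mean_abs_difference(standardize(scene), standardize(changed))

print(f"raw pseudo-change:        {raw_pseudo:.3f}")
print(f"normalized pseudo-change: {norm_pseudo:.3e}")
print(f"normalized real change:   {norm_real:.3f}")
```

Standardization only cancels a global gain and offset. Seasonal or spatially varying illumination effects of the kind the abstract mentions would need a richer model, which is presumably the role of the manifold mapping step.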
About the journal:
The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues that are being sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied Earth observations. The ‘Applications’ areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these areas, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the journal attracts a broad range of interests, which both serves present members in new ways and expands IEEE visibility into new areas.