TripleS: Mitigating multi-task learning conflicts for semantic change detection in high-resolution remote sensing imagery

IF 12.2 | Zone 1, Earth Sciences | JCR Q1, GEOGRAPHY, PHYSICAL

ISPRS Journal of Photogrammetry and Remote Sensing, Volume 230, Pages 374-401 | Pub Date: 2025-09-27 | DOI: 10.1016/j.isprsjprs.2025.09.019

Xiaoliang Tan, Guanzhou Chen, Xiaodong Zhang, Tong Wang, Jiaqi Wang, Kui Wang, Tingxuan Miao


Citations: 0

Abstract



Periodical earth observation from multi-temporal high spatial resolution remote sensing imagery (RSI) offers valuable insights into the complex dynamics of land surface changes. Semantic change detection (SCD), cooperating with deep learning (DL) architectures, has evolved from binary change detection (BCD) into an effective technique capable of not only identifying change locations but also specifying land-cover and land-use (LCLU) categories. Recent advancements suggest that SCD can be modeled as a multi-task learning (MTL) framework, involving multiple branches for individual subtasks to process dual RSI inputs, and optimized through joint training. However, limitations remain in the inadequate interactions between bi-temporal branches and semantic-change branches, as well as the pervasive gradient conflicts among subtasks within MTL frameworks, which can lead to counterbalanced performances. To address the above limitations, we propose an MTL-oriented SCD model (MOSCD), which mutually enhances bi-temporal features, while ensuring that representations across the subtask branches are coherently correlated. Furthermore, the TripleS framework is designed to enhance the optimization of the MTL framework through counteracting the conflicting subtask objectives, which incorporates three novel schemes: Stepwise multi-task optimization, Selective parameter binding, and Scheduling for dynamically training MTL bindings. Extensive experiments conducted on three full-coverage land-cover SCD datasets, including one public dataset (HRSCD) and two self-constructed datasets (SC-SCD7 and CC-SCD5), demonstrate that the MOSCD enhanced with TripleS outperforms eleven existing SCD methods and three MTL methods by up to 21.17% on SeK metrics. The robust performances over diverse landscapes and transferability on other componentized benchmarks validate that the MOSCD trained with TripleS is a practicable tool for detecting subtle land-cover changes from high spatial resolution RSI data. Codes and the two constructed datasets will be available at https://github.com/StephenApX/MTL-TripleS.
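The abstract refers to gradient conflicts among subtasks that share parameters. As a rough illustration of what such a conflict means in general, and not a reproduction of the TripleS schemes or the MOSCD model, the sketch below uses two hypothetical toy losses whose gradients on a shared weight vector point in opposing directions. The loss definitions, function names, and the negative-cosine criterion are all assumptions made for this example.

```python
import math

def grad_task_a(w):
    # Gradient of the hypothetical toy loss L_a(w) = sum((w_i - 1)^2),
    # which pulls every shared weight toward +1.
    return [2.0 * (wi - 1.0) for wi in w]

def grad_task_b(w):
    # Gradient of the hypothetical toy loss L_b(w) = sum((w_i + 1)^2),
    # which pulls every shared weight toward -1.
    return [2.0 * (wi + 1.0) for wi in w]

def cosine(u, v):
    # Cosine similarity between two gradient vectors; 0.0 if either is zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def gradients_conflict(u, v):
    # Assumed criterion: two task gradients conflict when they form an
    # obtuse angle, i.e. a step that helps one task hurts the other.
    return cosine(u, v) < 0.0

shared_w = [0.0, 0.5]          # shared parameters lying between the two optima
g_a = grad_task_a(shared_w)    # [-2.0, -1.0]
g_b = grad_task_b(shared_w)    # [2.0, 3.0]
print("cosine:", round(cosine(g_a, g_b), 3))
print("conflict:", gradients_conflict(g_a, g_b))
```

With the shared weights between the two optima, the gradients oppose each other and the check reports a conflict; far from both optima (for example at `[3.0, 3.0]`) the two gradients align and no conflict is reported. In a real SCD network the same check would be applied to per-subtask gradients of the shared encoder parameters.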


Source journal

ISPRS Journal of Photogrammetry and Remote Sensing (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days

Journal description: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.