3D-HRSCD: Exploiting the Potential of Multiscale Features by 3-D Convolution
Authors: Yue Song; Sheng Fang; Zhe Li; Su Wang; Enyi Zhao
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Published: 2025-07-22
DOI: 10.1109/LGRS.2025.3591276
URL: https://ieeexplore.ieee.org/document/11087592/
Semantic change detection (SCD) in remote sensing images (RSIs) is critical for monitoring land-cover and land-use transformations. Although existing SCD methods have made progress in modeling temporal dependency, they still struggle to capture multiscale features effectively and to model the interactions among them. To address these issues, we propose 3D-HRSCD, a novel architecture that uses 3-D convolution to model temporal dependency across HRNet’s multiresolution features. The core of the architecture is the 3-D convolution fusion oriented to multiscale features (3DFOM) module, which lets multiscale features interact across the channel, spatial, and temporal dimensions. To support more efficient temporal dependency modeling in 3DFOM, a cosine similarity-based temporal multiscale attention (CTMAs) module serves as a preprocessing stage that enhances features in change regions. Additionally, a comprehensive semantic consistency (CSC) loss function is introduced to further suppress pseudo-changes and reduce semantic recognition errors. Experimental results show that our method outperforms previous state-of-the-art (SOTA) SCD methods.
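The abstract does not give the internals of CTMAs or 3DFOM, so the following is only a minimal single-scale sketch of the two general ideas it names: a cosine-similarity weight that emphasizes pixels whose bi-temporal features disagree, and a 3-D convolution applied to the two feature maps stacked along a temporal axis. The function names, the `(1 - cos) / 2` weighting, the residual reweighting, and the naive convolution loop are all illustrative assumptions, not the authors' implementation; HRNet's multiresolution branches are not modeled.

```python
import numpy as np

def cosine_change_attention(f1, f2, eps=1e-8):
    """Per-pixel change weight in [0, 1] from two (C, H, W) feature maps.

    Assumed weighting: 0 when the feature vectors point the same way,
    1 when they point in opposite directions.
    """
    dot = (f1 * f2).sum(axis=0)
    norm = np.linalg.norm(f1, axis=0) * np.linalg.norm(f2, axis=0)
    cos = dot / (norm + eps)
    return (1.0 - cos) / 2.0  # shape (H, W)

def enhance(f, attn):
    """Residual reweighting (an assumption): amplify features where attn is high."""
    return f * (1.0 + attn[None])

def conv3d_valid(x, kernel):
    """Naive 3-D cross-correlation with one output channel and no padding.

    x: (C, T, H, W), kernel: (C, kT, kH, kW).
    Returns an array of shape (T - kT + 1, H - kH + 1, W - kW + 1).
    """
    _, T, H, W = x.shape
    _, kT, kH, kW = kernel.shape
    out = np.zeros((T - kT + 1, H - kH + 1, W - kW + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                window = x[:, t:t + kT, i:i + kH, j:j + kW]
                out[t, i, j] = np.sum(window * kernel)
    return out

def fuse_bitemporal(f1, f2, kernel):
    """Enhance both dates with the change weight, stack on a time axis, convolve."""
    attn = cosine_change_attention(f1, f2)
    stacked = np.stack([enhance(f1, attn), enhance(f2, attn)], axis=1)  # (C, 2, H, W)
    return conv3d_valid(stacked, kernel)

# Toy input: identical features everywhere except one flipped pixel.
f1 = np.ones((4, 5, 5))
f2 = f1.copy()
f2[:, 2, 2] = -1.0
attn = cosine_change_attention(f1, f2)
fused = fuse_bitemporal(f1, f2, np.ones((4, 2, 3, 3)))
```

In the toy input the change weight is close to 1 only at the flipped pixel, and the kernel spans both dates at once, so each output value mixes channel, spatial, and temporal context in a single operation. A practical model would use a learned, multi-channel 3-D convolution layer from a deep learning framework rather than this explicit loop.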