RoGLSNet: An Efficient Global–Local Scene Awareness Network With Rotary Position Embedding for Remote Image Segmentation

Impact factor: 4.4

IEEE Geoscience and Remote Sensing Letters: a publication of the IEEE Geoscience and Remote Sensing Society. Pub Date: 2025-09-09. DOI: 10.1109/LGRS.2025.3607840

Xiaosheng Yu;Weiqi Bai;Jubo Chen;Jiawei Huang;Zhuoqun Fang;Zhaokui Li

Volume 22, pages 1-5. IEEE Xplore: https://ieeexplore.ieee.org/document/11154049/

Citations: 0

Abstract



Accurate segmentation of very high-resolution remote sensing images is vital for downstream tasks. Most semantic segmentation methods fail to fully consider the inherent characteristics of the images, such as intricate backgrounds, significant intraclass variance, and spatial interdependence of geographic object distribution. To address these challenges, we propose an efficient global–local scene awareness network with rotary position embedding (RoGLSNet). Specifically, we introduce the dynamic global filter (DGF) module to adaptively select frequency components, thereby mitigating interference from background noise. For high intraclass variance, the class center aware block (CCAB) performs class-level contextual modeling with spatial information integration. Additionally, the rotary position embedding (RoPE) is incorporated into vanilla attention to indirectly model the positional and distance relationships of geographic target objects. Extensive experimental results on two widely used datasets demonstrate that RoGLSNet outperforms the state-of-the-art (SOTA) segmentation methods. The code is available at https://github.com/bai101315/RoGLSNet
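To make the RoPE idea in the abstract concrete, here is a minimal sketch of rotary position embedding applied to a single attention logit. It is not taken from the RoGLSNet repository: the function names `rope_rotate` and `rope_score`, the 1-D token positions, and the base of 10000 are assumptions chosen for illustration, and the paper's own variant for image features may differ. The sketch shows the property the abstract relies on: after rotation, the query-key score depends on the relative offset between positions rather than on their absolute values.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive channel pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; assumes a 1-D position index.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("channel dimension must be even")
    out = []
    for i in range(0, d, 2):
        # Pair (i, i+1) uses the rotation frequency base^(-i/d).
        angle = position * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def rope_score(q, k, q_pos, k_pos):
    """Scaled dot-product attention logit between a rotated query and key."""
    q_rot = rope_rotate(q, q_pos)
    k_rot = rope_rotate(k, k_pos)
    return sum(a * b for a, b in zip(q_rot, k_rot)) / math.sqrt(len(q))


# Example vectors (arbitrary values, not from the paper).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# The same offset (3) at different absolute positions yields the same logit.
same_offset = math.isclose(
    rope_score(q, k, 5, 2), rope_score(q, k, 40, 37), abs_tol=1e-9
)
print(same_offset)
```

Because each channel pair is only rotated, the vector norm is unchanged, so position is encoded without rescaling the features that attention compares.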

