Playing to the Strengths of High- and Low-Resolution Cues for Ultra-High Resolution Image Segmentation

Qi Li; Jiexin Luo; Chunxiao Chen; Jiaxin Cai; Wenjie Yang; Yuanlong Yu; Shengfeng He; Wenxi Liu

IEEE Robotics and Automation Letters, vol. 10, no. 8, pp. 7787-7794. Published 2025-06-13. DOI: 10.1109/LRA.2025.3579605. Available at: https://ieeexplore.ieee.org/document/11034711/
Citations: 0
Abstract
In the ultra-high-resolution image segmentation task for robotic platforms such as AAVs and autonomous vehicles, existing paradigms process a downsampled input image through a deep network and the original high-resolution image through a shallow network, then fuse their features for the final segmentation. Although these features are designed to be complementary, they often contain redundant or even conflicting semantic information, which leads to blurred edge contours, particularly for small objects. This is especially detrimental to robotics applications requiring precise spatial awareness. To address this challenge, we propose a novel paradigm that disentangles the task into two independent subtasks concerning the high- and low-resolution inputs, leveraging high-resolution features exclusively to capture low-level structured details and low-resolution features to extract semantics. Specifically, for the high-resolution input, we propose a region-pixel association experts scheme that partitions the image into multiple regions. For the low-resolution input, we assign compact semantic tokens to the partitioned regions. Additionally, we incorporate a high-resolution local perception scheme with an efficient field-enriched local context module to enhance small-object recognition in cases of incorrect semantic assignment. Extensive experiments demonstrate the state-of-the-art performance of our method and validate the effectiveness of each designed component.
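The core composition step described above — a high-resolution branch that partitions pixels into regions, and a low-resolution branch that assigns a semantic token to each region — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the `region_map` and `region_tokens` inputs are hypothetical stand-ins for the outputs of the two branches, and the real method learns both end to end.

```python
import numpy as np

def compose_segmentation(region_map, region_tokens):
    """Compose the final label map from the two branches.

    region_map    : (H, W) int array; each pixel's region id,
                    as produced by the high-resolution branch.
    region_tokens : (R, C) float array; one C-class semantic token
                    per region, from the low-resolution branch.
    Returns a (H, W) int array of per-pixel class labels.
    """
    # Each region's class is the argmax of its semantic token.
    region_classes = region_tokens.argmax(axis=1)       # shape (R,)
    # Broadcast region classes back to pixels via fancy indexing.
    return region_classes[region_map]                   # shape (H, W)

# Toy example: a 4x4 image split into two vertical regions, three classes.
region_map = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [0, 0, 1, 1]])
region_tokens = np.array([[0.1, 0.8, 0.1],   # region 0 -> class 1
                          [0.7, 0.2, 0.1]])  # region 1 -> class 0
labels = compose_segmentation(region_map, region_tokens)
```

Because semantics are assigned per region rather than per pixel, edge quality is governed entirely by the high-resolution partition, which is the disentanglement the paper argues for.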
Journal overview:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.