Clustering-Based Adaptive Query Generation for Semantic Segmentation

IF 3.2 | Zone 2, Engineering & Technology | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Signal Processing Letters Pub Date : 2025-04-08 DOI:10.1109/LSP.2025.3558160

Yeong Woo Kim;Wonjun Kim

IEEE Signal Processing Letters, vol. 32, pp. 1580-1584. https://ieeexplore.ieee.org/document/10949765/

Citations: 0

Abstract

Semantic segmentation is a crucial task in computer vision that aims to label each pixel according to its class. Recently, several semantic segmentation methods that adopt a transformer decoder with learnable queries have achieved impressive improvements. However, since learnable queries are primarily determined by the distribution of training samples, the discriminative characteristics of the input image have often been disregarded. In this letter, we propose a novel clustering-based query generation method for semantic segmentation. The key idea of the proposed method is to adaptively generate queries with a clustering scheme that leverages semantic affinities in the latent space. By aggregating latent features that represent the same class in a given input, the semantic information of each class can be efficiently encoded into the query. Furthermore, we propose to apply an auxiliary loss function that predicts the segmentation result at a coarse scale during query generation. This enables each query to grasp the spatial information of the target object in a given image. Experimental results on various benchmarks show that the proposed method effectively improves the performance of semantic segmentation.
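The aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering scheme, so the code below substitutes a plain soft k-means style loop, in which dot-product affinities between latent features and cluster centers give soft assignments, and each query is the assignment-weighted mean of the features. The function names, the `temperature` parameter, the number of iterations, and the assumption that cluster k corresponds to class k in the coarse auxiliary loss are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def soft_assign(features, centers, temperature=1.0):
    """Soft membership of every feature in every cluster (each row sums to 1)."""
    return [softmax([dot(f, c) / temperature for c in centers]) for f in features]


def generate_queries(features, init_centers, num_iters=3, temperature=1.0):
    """Sketch of input-adaptive query generation (assumed soft k-means variant).

    features:     N latent feature vectors of length D (one per spatial position)
    init_centers: K initial cluster centers of length D
    Returns (queries, coarse_map): K aggregated vectors and the N x K soft
    assignment map computed against those final queries.
    """
    centers = [list(c) for c in init_centers]
    for _ in range(num_iters):
        weights = soft_assign(features, centers, temperature)
        updated = []
        for k, old in enumerate(centers):
            mass = sum(w[k] for w in weights)
            if mass < 1e-12:  # keep a center that attracts no features
                updated.append(old)
                continue
            # Weighted mean of the features softly assigned to cluster k.
            updated.append([
                sum(w[k] * f[d] for w, f in zip(weights, features)) / mass
                for d in range(len(old))
            ])
        centers = updated
    return centers, soft_assign(features, centers, temperature)


def coarse_auxiliary_loss(coarse_map, labels):
    """Mean cross-entropy of the coarse map, ASSUMING cluster k stands for class k."""
    eps = 1e-12
    return -sum(math.log(row[y] + eps) for row, y in zip(coarse_map, labels)) / len(labels)


# Toy input: two positions aligned with the first axis, two with the second.
features = [[3.0, 0.1], [2.8, -0.1], [0.1, 3.0], [-0.1, 2.9]]
queries, coarse_map = generate_queries(features, [[1.0, 0.0], [0.0, 1.0]])
loss = coarse_auxiliary_loss(coarse_map, [0, 0, 1, 1])
print(queries, loss)
```

In this toy setting the queries depend on the input features rather than on fixed learned parameters, which is the contrast with learnable queries that the abstract draws. In a real model the features would come from an encoder and the queries would be passed to a transformer decoder; neither is modeled here.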

Source journal

IEEE Signal Processing Letters (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

7.40

Self-citation rate

12.80%

Articles published

339

Review time

2.8 months

About the journal: The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented, within one year of their appearance, at signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also at several workshops organized by the Signal Processing Society.