Clustering-Based Adaptive Query Generation for Semantic Segmentation
Authors: Yeong Woo Kim; Wonjun Kim
Journal: IEEE Signal Processing Letters, vol. 32, pp. 1580-1584 (Journal Article)
Publication date: 2025-04-08
DOI: 10.1109/LSP.2025.3558160
URL: https://ieeexplore.ieee.org/document/10949765/
Journal impact factor: 3.2; JCR: Q2 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
Semantic segmentation is a crucial task in computer vision that aims to label each pixel according to its class. Recently, several semantic segmentation methods that adopt a transformer decoder with learnable queries have achieved impressive improvements. However, since learnable queries are primarily determined by the distribution of training samples, the discriminative characteristics of the input image have often been disregarded. In this letter, we propose a novel clustering-based query generation method for semantic segmentation. The key idea of the proposed method is to adaptively generate queries based on a clustering scheme that leverages semantic affinities in the latent space. By aggregating latent features that represent the same class in a given input, the semantic information of each class can be efficiently encoded into the query. Furthermore, we propose applying an auxiliary loss function to predict the segmentation result at a coarse scale during query generation. This enables each query to capture the spatial information of the target object in a given image. Experimental results on various benchmarks show that the proposed method effectively improves the performance of semantic segmentation.
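The abstract does not specify the clustering algorithm or the form of the auxiliary loss, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a soft k-means-style update over flattened latent features, one cluster per class, and a cross-entropy loss on the soft assignments as the coarse-scale auxiliary objective. The names `generate_queries`, `coarse_cross_entropy`, `init_centers`, and `temperature` are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def generate_queries(features, init_centers, num_iters=3, temperature=1.0):
    """Generate image-adaptive queries by soft clustering (illustrative only).

    features:     (N, D) latent features of one image, flattened (N = H * W).
    init_centers: (K, D) initial cluster centers (assumed, e.g. learned embeddings).
    Returns the queries (K, D) and the soft assignments (N, K).
    """
    centers = init_centers
    for _ in range(num_iters):
        # Semantic affinity between every feature and every cluster center.
        assign = softmax(features @ centers.T / temperature, axis=1)  # (N, K)
        # Aggregate the features assigned to each cluster (weighted mean).
        weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-8)
        centers = weights.T @ features  # (K, D)
    # Recompute assignments so they match the final centers.
    assign = softmax(features @ centers.T / temperature, axis=1)
    return centers, assign


def coarse_cross_entropy(assign, labels):
    """Auxiliary loss sketch: cross-entropy between the soft assignments
    (read as a coarse segmentation map) and coarse-scale class labels.
    Assumes cluster k corresponds to class k.
    """
    picked = assign[np.arange(labels.shape[0]), labels]
    return float(-np.log(np.clip(picked, 1e-8, 1.0)).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8 * 8, 16))   # an 8x8 latent map with 16 channels
    init = rng.normal(size=(4, 16))        # 4 hypothetical classes
    queries, assign = generate_queries(feats, init)
    labels = rng.integers(0, 4, size=8 * 8)
    print(queries.shape, assign.shape, coarse_cross_entropy(assign, labels))
```

In this sketch the queries depend on the input image because they are recomputed from its own features, in contrast to fixed learnable queries. Reshaping `assign` to `(H, W, K)` gives the coarse map on which the auxiliary loss is evaluated.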
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.