Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach.

IF 18.6 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-09-16 DOI:10.1109/tpami.2025.3609882

Shilong Bao,Qianqian Xu,Feiran Li,Boyu Han,Zhiyong Yang,Xiaochun Cao,Qingming Huang

{"title":"Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach.","authors":"Shilong Bao,Qianqian Xu,Feiran Li,Boyu Han,Zhiyong Yang,Xiaochun Cao,Qingming Huang","doi":"10.1109/tpami.2025.3609882","DOIUrl":null,"url":null,"abstract":"This paper investigates a fundamental yet underexplored issue in Salient Object Detection (SOD): the size-invariant property for evaluation protocols, particularly in scenarios when multiple salient objects of significantly different sizes appear within a single image. We first present a novel perspective to expose the inherent size sensitivity of existing widely used SOD metrics. Through careful theoretical derivations, we show that the evaluation outcome of an image under current SOD metrics can be essentially decomposed into a sum of several separable terms, with the contribution of each term being directly proportional to its corresponding region size. Consequently, the prediction errors would be dominated by the larger regions, while smaller yet potentially more semantically important objects are often overlooked, leading to biased performance assessments and practical degradation. To address this challenge, a generic Size-Invariant Evaluation (SIEva) framework is proposed. The core idea is to evaluate each separable component individually and then aggregate the results, thereby effectively mitigating the impact of size imbalance across objects. Building upon this, we further develop a dedicated optimization framework (SIOpt), which adheres to the size-invariant principle and significantly enhances the detection of salient objects across a broad range of sizes. Notably, SIOpt is model-agnostic and can be seamlessly integrated with a wide range of SOD backbones. Theoretically, we also present generalization analysis of SOD methods and provide evidence supporting the validity of our new evaluation protocols. Finally, comprehensive experiments speak to the efficacy of our proposed approach.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"64 1","pages":""},"PeriodicalIF":18.6000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tpami.2025.3609882","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

This paper investigates a fundamental yet underexplored issue in Salient Object Detection (SOD): the size-invariant property for evaluation protocols, particularly in scenarios when multiple salient objects of significantly different sizes appear within a single image. We first present a novel perspective to expose the inherent size sensitivity of existing widely used SOD metrics. Through careful theoretical derivations, we show that the evaluation outcome of an image under current SOD metrics can be essentially decomposed into a sum of several separable terms, with the contribution of each term being directly proportional to its corresponding region size. Consequently, the prediction errors would be dominated by the larger regions, while smaller yet potentially more semantically important objects are often overlooked, leading to biased performance assessments and practical degradation. To address this challenge, a generic Size-Invariant Evaluation (SIEva) framework is proposed. The core idea is to evaluate each separable component individually and then aggregate the results, thereby effectively mitigating the impact of size imbalance across objects. Building upon this, we further develop a dedicated optimization framework (SIOpt), which adheres to the size-invariant principle and significantly enhances the detection of salient objects across a broad range of sizes. Notably, SIOpt is model-agnostic and can be seamlessly integrated with a wide range of SOD backbones. Theoretically, we also present generalization analysis of SOD methods and provide evidence supporting the validity of our new evaluation protocols. Finally, comprehensive experiments speak to the efficacy of our proposed approach.

查看原文本刊更多论文

面向尺寸不变的显著目标检测：一种通用的评估和优化方法。

本文研究了显著目标检测（SOD）中一个基本但尚未被充分探索的问题：评估协议的尺寸不变性，特别是在单个图像中出现多个显著大小不同的显著目标的情况下。我们首先提出了一个新的视角来揭示现有广泛使用的SOD指标的固有尺寸敏感性。通过仔细的理论推导，我们表明，在当前SOD指标下，图像的评估结果基本上可以分解为几个可分离项的总和，每个项的贡献与其对应的区域大小成正比。因此，预测误差将由较大的区域主导，而较小但可能更语义重要的对象往往被忽略，导致有偏见的性能评估和实际退化。为了解决这一挑战，提出了一个通用的尺寸不变评估（SIEva）框架。其核心思想是单独评估每个可分离组件，然后汇总结果，从而有效地减轻对象之间大小不平衡的影响。在此基础上，我们进一步开发了一个专用的优化框架（SIOpt），它坚持尺寸不变原则，并显着增强了对各种尺寸范围内显著物体的检测。值得注意的是，SIOpt与模型无关，可以与各种SOD主干无缝集成。从理论上讲，我们也提出了SOD方法的泛化分析，并提供证据支持我们的新评估方案的有效性。最后，综合实验证明了所提方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Transactions on Pattern Analysis and Machine Intelligence 工程技术-工程：电子与电气

CiteScore

28.40

自引率

3.00%

发文量

885

审稿时长

8.5 months

期刊介绍： The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.