A Novel Dense Object Detector With Scale Balanced Sample Assignment and Refinement

Impact factor: 11.1 · Region 1 (Engineering & Technology) · JCR Q1 (Engineering, Electrical & Electronic)

IEEE Transactions on Circuits and Systems for Video Technology · Pub Date: 2025-03-18 · DOI: 10.1109/TCSVT.2025.3551912

Jinpeng Dong;Dingyi Yao;Yufeng Hu;Sanping Zhou;Nanning Zheng

Volume 35, issue 9, pages 9337-9350 · Journal Article · https://ieeexplore.ieee.org/document/10929026/

Citations: 0

Abstract

Scale variation of objects remains one of the crucial challenges in object detection. Conventional dense detectors, with their fixed receptive fields and label weights, are poorly suited to detecting multi-scale objects. In existing studies, however, unbalanced label weights and fixed refinement across multi-scale objects and multiple tasks make it difficult to achieve better detection performance. In this paper, we propose a novel dense detector named Balanced FCOS, which consists of two components: Balanced Label Assignment (BLA) and Flexible Shape-based Refinement (FSR). The BLA implements scale-balanced sample assignment by introducing reweighting factors, composed of localization and classification scores, into the label assignment, so that low-quality samples that would otherwise carry high weights are down-weighted. Furthermore, we design a cross-reweighting mechanism in the BLA to ensure score consistency between classification and localization. The FSR implements scale-balanced sample refinement: based on each object's coarse features, it learns flexible sampling-point offsets for multi-scale objects and multiple tasks, yielding more discriminative features with an appropriate receptive field. In addition, the better features obtained by the FSR lead to better classification and localization scores, which the BLA can use to produce more accurate label weights. Equipped with only the BLA, we achieve 41.7/46.6 AP with R50/R101-FCOS without any additional parameters. When the BLA is combined with the FSR, our Balanced FCOS achieves state-of-the-art (SOTA) results among dense detectors on the COCO test-dev set. Experiments on other heads (T-Head, DyHead), detectors (DINO), and datasets (AI-TOD) further demonstrate the effectiveness of our method.
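The abstract does not give the BLA formulas, so the following is only a rough sketch of what a cross-reweighting step could look like, not the authors' method. It assumes (hypothetically) that each sample's classification weight is scaled by its localization quality, measured here as IoU with the ground-truth box, and that its localization weight is scaled by its classification score. The function names `iou` and `cross_reweight` and the simple multiplicative form are illustrative choices.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cross_reweight(
    base_weights: Sequence[float],
    cls_scores: Sequence[float],
    pred_boxes: Sequence[Box],
    gt_box: Box,
) -> Tuple[List[float], List[float]]:
    """Hypothetical cross-reweighting (an assumption, not the paper's formula).

    The classification weight is scaled by the localization score (IoU),
    and the localization weight is scaled by the classification score,
    so a sample keeps a high weight only if both tasks agree it is good.
    """
    cls_weights: List[float] = []
    loc_weights: List[float] = []
    for w, s, box in zip(base_weights, cls_scores, pred_boxes):
        q = iou(box, gt_box)          # localization quality in [0, 1]
        cls_weights.append(w * q)     # classification weight uses localization score
        loc_weights.append(w * s)     # localization weight uses classification score
    return cls_weights, loc_weights
```

Under these assumptions, a sample with a high initial weight but a poorly localized box and a low classification score ends up with smaller weights than a well-predicted sample that started with a lower initial weight, which is the "low-quality but high-weight" case the abstract says the BLA weakens.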


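Source journal

IEEE Transactions on Circuits and Systems for Video Technology (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 27.40%

Articles published: 660

Review time: 5 months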

About the journal: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.