Dynamic selection of Gaussian samples for object detection on drone images via shape sensing

IF 7.5 · CAS Region 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Recognition · Pub Date: 2025-06-24 · DOI: 10.1016/j.patcog.2025.111978

Yixuan Li, Yulong Xu, Renwu Sun, Pengnian Wu, Meng Zhang

Pattern Recognition, vol. 170, Article 111978 (journal article; not open access). https://www.sciencedirect.com/science/article/pii/S0031320325006387

Citations: 0

Abstract

Label assignment (LA) has been studied extensively as a fundamental problem in object detection. However, the drastic scale changes and wide variation in shape (aspect ratio) of objects in drone images cause a sharp performance drop for general LA strategies. To address these problems, we propose an adaptive Gaussian sample selection strategy for multi-scale objects via shape sensing. Specifically, we first model receptive field priors and ground-truth (gt) boxes as Gaussians, which guarantees a non-zero distance metric between any feature point and any gt box anywhere in the image. We then show theoretically that the Kullback–Leibler divergence (KLD) measures distance in a way that depends on the characteristics of the object. Exploiting this property, we use statistics of the top-K KLD-based matching scores as the positive-sample selection threshold for each gt, thereby assigning enough high-quality samples to multi-scale objects. More importantly, we introduce an adaptive shape-aware strategy that adjusts the number of samples according to each object's aspect ratio, guiding the network toward balanced learning across multi-scale objects of various shapes. Extensive experiments show that our dynamic shape-aware LA strategy applies to a variety of advanced detectors and consistently improves performance on two major benchmarks (VisDrone and UAVDT), demonstrating the effectiveness of our approach.
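The pipeline the abstract describes (Gaussian modeling, a KLD-based score, a per-gt threshold from top-K statistics, and a shape-dependent sample count) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, because the abstract gives no formulas. Every concrete choice here is an assumption: the box-to-Gaussian convention (standard deviations of w/2 and h/2), the direction KL(prior || gt), the score 1/(1 + KLD), the mean-plus-standard-deviation threshold, and the hypothetical rule `shape_aware_k`, which scales K by the aspect ratio. Only the closed-form KLD between two Gaussians is standard.

```python
import numpy as np

def box_to_gaussian(boxes):
    """Map (cx, cy, w, h) boxes to means (N, 2) and diagonal variances (N, 2)."""
    boxes = np.asarray(boxes, dtype=float)
    # Assumed convention: sigma_x = w / 2, sigma_y = h / 2.
    return boxes[:, :2], (boxes[:, 2:] / 2.0) ** 2

def kld_diag(mean_p, var_p, mean_q, var_q):
    """Pairwise KL(N_p || N_q) for 2-D diagonal Gaussians; returns shape (P, Q)."""
    mp, vp = mean_p[:, None, :], var_p[:, None, :]
    mq, vq = mean_q[None, :, :], var_q[None, :, :]
    trace = (vp / vq).sum(-1)                    # tr(Sigma_q^-1 Sigma_p)
    maha = ((mq - mp) ** 2 / vq).sum(-1)         # squared offset scaled by the q variances
    logdet = np.log(vq).sum(-1) - np.log(vp).sum(-1)
    return 0.5 * (trace + maha + logdet - 2.0)   # dimension k = 2

def shape_aware_k(w, h, k):
    """Hypothetical rule: elongated boxes get proportionally more candidates."""
    ratio = max(w, h) / min(w, h)
    return int(np.ceil(k * ratio))

def assign(priors, gts, k=9):
    """Return a boolean (P, G) positive mask and the (P, G) matching scores."""
    priors = np.asarray(priors, dtype=float)
    gts = np.asarray(gts, dtype=float)
    pm, pv = box_to_gaussian(priors)
    gm, gv = box_to_gaussian(gts)
    # Assumed score: maps KLD in [0, inf) to (0, 1]; never exactly zero.
    score = 1.0 / (1.0 + kld_diag(pm, pv, gm, gv))
    pos = np.zeros(score.shape, dtype=bool)
    for g in range(len(gts)):
        k_g = min(len(priors), shape_aware_k(gts[g, 2], gts[g, 3], k))
        top = np.sort(score[:, g])[-k_g:]        # top-K scores for this gt
        # Assumed statistic: mean + std, capped so the best prior always passes.
        thr = min(top.mean() + top.std(), top.max())
        pos[:, g] = score[:, g] >= thr
    return pos, score

# Toy 8x8 grid of priors with stride 4 and an 8x8 receptive field each.
priors = [[x * 4 + 2, y * 4 + 2, 8, 8] for y in range(8) for x in range(8)]
gts = [[14, 14, 8, 8], [18, 10, 16, 4]]          # one square and one elongated object
pos, score = assign(priors, gts)
print(pos.sum(axis=0))                           # positives assigned to each gt
```

Because the score is strictly positive for every prior-gt pair, even a tiny object that overlaps no prior still ranks all feature points, which is the property the abstract attributes to Gaussian modeling. Capping the threshold at the best score is a design choice of this sketch so that no gt is left without a positive sample.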

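Source journal

Pattern Recognition (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months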

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.