Dynamic selection of Gaussian samples for object detection on drone images via shape sensing
Yixuan Li, Yulong Xu, Renwu Sun, Pengnian Wu, Meng Zhang
Pattern Recognition, vol. 170, Article 111978. Published 2025-06-24. DOI: 10.1016/j.patcog.2025.111978
https://www.sciencedirect.com/science/article/pii/S0031320325006387
Abstract
Label assignment (LA) is a fundamental and extensively studied problem in object detection. However, the drastic scale changes and wide shape (aspect ratio) variations of objects in drone images cause a sharp performance drop for general LA strategies. To address these problems, we propose an adaptive Gaussian sample selection strategy for multi-scale objects via shape sensing. Specifically, we first model both receptive field priors and ground-truth (gt) boxes as Gaussians, which yields a non-zero distance metric between any feature point and any gt box across the whole image. We then show theoretically that the Kullback–Leibler Divergence (KLD) measures distance in a way that reflects the characteristics of the object. Exploiting this property, we use statistics of the top-K KLD-based matching scores as the positive sample selection threshold for each gt box, thereby assigning enough high-quality samples to multi-scale objects. More importantly, we introduce an adaptive shape-aware strategy that adjusts the number of samples according to each object's aspect ratio, guiding the network toward balanced learning across multi-scale objects of various shapes. Extensive experiments show that our dynamic shape-aware LA strategy applies to a variety of advanced detectors and consistently improves performance on two major benchmarks, VisDrone and UAVDT, demonstrating the effectiveness of our approach.
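The pipeline described in the abstract (Gaussian modeling, KLD-based scoring, a top-K statistical threshold) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: the box-to-Gaussian mapping with covariance diag(w²/4, h²/4), the KLD direction (prior relative to gt), the score 1/(1 + KLD), the mean-plus-standard-deviation threshold, and the clamp that keeps at least one positive. All function names are hypothetical, and K is left as a plain parameter because the abstract does not specify the shape-aware rule.

```python
import math
from statistics import mean, pstdev


def box_to_gaussian(cx, cy, w, h):
    """Model a box as a 2D Gaussian (assumed mapping, not from the abstract):
    mean at the box center, diagonal covariance (w^2/4, h^2/4)."""
    return (cx, cy), (w * w / 4.0, h * h / 4.0)


def kld_diag(p, q):
    """Closed-form KL(p || q) for two 2D Gaussians with diagonal covariance."""
    (mpx, mpy), (vpx, vpy) = p
    (mqx, mqy), (vqx, vqy) = q
    trace = vpx / vqx + vpy / vqy
    maha = (mqx - mpx) ** 2 / vqx + (mqy - mpy) ** 2 / vqy
    log_det = math.log((vqx * vqy) / (vpx * vpy))
    return 0.5 * (trace + maha - 2.0 + log_det)


def matching_score(prior, gt):
    """Map KLD in [0, inf) to a score in (0, 1]; the mapping is an assumption."""
    return 1.0 / (1.0 + kld_diag(prior, gt))


def select_positives(priors, gt_box, k=3):
    """Return (indices of positive priors, all scores) for one gt box.

    Threshold = mean + std of the top-k scores (an assumed choice of
    'statistical characteristics'), clamped to the best score so that
    at least one prior is kept (also an assumption of this sketch).
    """
    gt = box_to_gaussian(*gt_box)
    scores = [matching_score(box_to_gaussian(*p), gt) for p in priors]
    top = sorted(range(len(priors)), key=lambda i: scores[i], reverse=True)[:k]
    top_scores = [scores[i] for i in top]
    threshold = min(mean(top_scores) + pstdev(top_scores), max(top_scores))
    return [i for i in top if scores[i] >= threshold], scores


# Boxes are (cx, cy, w, h); the gt box is twice as wide as it is tall.
gt_box = (50, 50, 20, 10)
priors = [(50, 50, 20, 10), (52, 50, 20, 10), (80, 80, 20, 10), (200, 200, 20, 10)]
positives, scores = select_positives(priors, gt_box, k=3)
```

Under these assumptions the score is non-zero even for the prior that does not overlap the gt box at all, and an offset along the short side of the box is penalized more than the same offset along the long side, which is one way to read the claim that KLD measures distance according to the object's shape.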
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.