基于微尺度感知和增强的小物体检测--位置特征金字塔

IF 5 3区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Guang Han;Chenwei Guo;Ziyang Li;Haitao Zhao
{"title":"基于微尺度感知和增强的小物体检测--位置特征金字塔","authors":"Guang Han;Chenwei Guo;Ziyang Li;Haitao Zhao","doi":"10.1109/TCDS.2024.3397684","DOIUrl":null,"url":null,"abstract":"Due to the large number of small objects, significant scale variation, and uneven distribution in images captured by unmanned aerial vehicles (UAVs), existing algorithms have high rates of missing and false detections of small objects in drone images. A new object detection algorithm based on microscale perception and enhancement-location feature pyramid is proposed in this article. The microscale perception module alternatives the original convolution module in backbone, changing the receptive field through two dilation branches with various dilation rates and an adjustment switch branch. To better match the size and shape of sampled targets, the weighted deformable convolution is employed. The enhancement-location feature pyramid module aggregates the features from each layer to obtain balanced semantic information and refines aggregated features to enhance their ability to represent features. Moreover, a bottom-up branch structure is added to utilize the property of lower layer features being beneficial to locating small objects to enhance the localization ability for small objects. Additionally, by using specific image cropping and combining techniques, the target distribution of the training data is altered to make the model more sensitive to small objects and improving its robustness. Finally, a sample balance strategy is used in combination with focal loss and a sample extraction control method to balance simple hard sample imbalance and the long-tail distribution of interclass sample imbalance during training. Experimental results show that the proposed algorithm achieves a mean average precision of 35.9% on the VisDrone2019 dataset, which is a 14.2% improvement over the baseline Cascade RCNN and demonstrates better performance in detecting small objects in drone images. Compared with advanced algorithms in recent years, it also achieves state-of-the-art detection accuracy.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 6","pages":"1982-1996"},"PeriodicalIF":5.0000,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Small Object Detection Based on Microscale Perception and Enhancement-Location Feature Pyramid\",\"authors\":\"Guang Han;Chenwei Guo;Ziyang Li;Haitao Zhao\",\"doi\":\"10.1109/TCDS.2024.3397684\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the large number of small objects, significant scale variation, and uneven distribution in images captured by unmanned aerial vehicles (UAVs), existing algorithms have high rates of missing and false detections of small objects in drone images. A new object detection algorithm based on microscale perception and enhancement-location feature pyramid is proposed in this article. The microscale perception module alternatives the original convolution module in backbone, changing the receptive field through two dilation branches with various dilation rates and an adjustment switch branch. To better match the size and shape of sampled targets, the weighted deformable convolution is employed. The enhancement-location feature pyramid module aggregates the features from each layer to obtain balanced semantic information and refines aggregated features to enhance their ability to represent features. Moreover, a bottom-up branch structure is added to utilize the property of lower layer features being beneficial to locating small objects to enhance the localization ability for small objects. Additionally, by using specific image cropping and combining techniques, the target distribution of the training data is altered to make the model more sensitive to small objects and improving its robustness. Finally, a sample balance strategy is used in combination with focal loss and a sample extraction control method to balance simple hard sample imbalance and the long-tail distribution of interclass sample imbalance during training. Experimental results show that the proposed algorithm achieves a mean average precision of 35.9% on the VisDrone2019 dataset, which is a 14.2% improvement over the baseline Cascade RCNN and demonstrates better performance in detecting small objects in drone images. Compared with advanced algorithms in recent years, it also achieves state-of-the-art detection accuracy.\",\"PeriodicalId\":54300,\"journal\":{\"name\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"volume\":\"16 6\",\"pages\":\"1982-1996\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10521894/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10521894/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

本文章由计算机程序翻译,如有差异,请以英文原文为准。
Small Object Detection Based on Microscale Perception and Enhancement-Location Feature Pyramid
Due to the large number of small objects, significant scale variation, and uneven distribution in images captured by unmanned aerial vehicles (UAVs), existing algorithms have high rates of missing and false detections of small objects in drone images. A new object detection algorithm based on microscale perception and enhancement-location feature pyramid is proposed in this article. The microscale perception module alternatives the original convolution module in backbone, changing the receptive field through two dilation branches with various dilation rates and an adjustment switch branch. To better match the size and shape of sampled targets, the weighted deformable convolution is employed. The enhancement-location feature pyramid module aggregates the features from each layer to obtain balanced semantic information and refines aggregated features to enhance their ability to represent features. Moreover, a bottom-up branch structure is added to utilize the property of lower layer features being beneficial to locating small objects to enhance the localization ability for small objects. Additionally, by using specific image cropping and combining techniques, the target distribution of the training data is altered to make the model more sensitive to small objects and improving its robustness. Finally, a sample balance strategy is used in combination with focal loss and a sample extraction control method to balance simple hard sample imbalance and the long-tail distribution of interclass sample imbalance during training. Experimental results show that the proposed algorithm achieves a mean average precision of 35.9% on the VisDrone2019 dataset, which is a 14.2% improvement over the baseline Cascade RCNN and demonstrates better performance in detecting small objects in drone images. Compared with advanced algorithms in recent years, it also achieves state-of-the-art detection accuracy.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
7.20
自引率
10.00%
发文量
170
期刊介绍: The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信