Hierarchical Active Learning for Low-Altitude Drone-View Object Detection

Impact Factor: 11.6 | Zone 2 (Computer Science) | Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

International Journal of Computer Vision | Pub Date: 2024-09-15 | DOI: 10.1007/s11263-024-02228-y

Haohao Hu, Tianyu Han, Yuerong Wang, Wanjun Zhong, Jingwei Yue, Peng Zan

Citations: 0

Abstract

Various object detection techniques are employed on drone platforms. However, annotating drone-view samples is time-consuming and laborious, primarily because each drone-view image contains numerous small instances to label. To tackle this issue, we propose HALD, a hierarchical active learning approach for low-altitude drone-view object detection. HALD extracts information from unlabeled images sequentially at four levels (point, box, image, and class) to obtain a reliable indicator of image informativeness. The point-level module determines the valid count and location of instances, while the box-level module filters for reliable predictions. The image-level module selects candidate samples by calculating the consistency of valid boxes within an image, and the class-level module selects the final samples based on the distribution of candidate and labeled samples across classes. Extensive experiments on the VisDrone and CityPersons datasets demonstrate that HALD outperforms several baseline methods. We also provide an in-depth analysis of each proposed module; the results show that the four hierarchical levels effectively improve the evaluation of sample informativeness.
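The four-level flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no formulas, so every rule below is an assumption made for illustration. Specifically, the sketch assumes that a box is "valid" if it contains a predicted instance point, that reliable boxes are those above a hypothetical score threshold `min_score`, that consistency is the mean best IoU between two prediction sets for the same image (for example, two augmented views), and that the class-level step greedily favors classes that are rare in the labeled set. All function and parameter names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Box:
    """One predicted box: (x1, y1, x2, y2), confidence score, class label."""
    xyxy: tuple
    score: float
    label: str


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def point_level(boxes, points):
    """Assumed rule: keep boxes that contain at least one predicted point."""
    def contains(box, p):
        x1, y1, x2, y2 = box.xyxy
        return x1 <= p[0] <= x2 and y1 <= p[1] <= y2
    return [b for b in boxes if any(contains(b, p) for p in points)]


def box_level(boxes, min_score=0.3):
    """Assumed rule: keep boxes whose score reaches a hypothetical threshold."""
    return [b for b in boxes if b.score >= min_score]


def image_level(boxes_a, boxes_b):
    """Assumed consistency: mean best IoU of set A against set B (0.0 if A is empty)."""
    if not boxes_a:
        return 0.0
    best = [max((iou(a.xyxy, b.xyxy) for b in boxes_b), default=0.0) for a in boxes_a]
    return sum(best) / len(best)


def class_level(candidates, labeled_counts, budget):
    """Assumed rule: greedily pick images whose classes are rarest in the labeled set.

    candidates: list of (image_id, [labels]); labeled_counts: Counter of labels.
    """
    counts = Counter(labeled_counts)
    remaining = list(candidates)
    chosen = []

    def rarity(item):
        labels = item[1]
        # Lower average labeled count means rarer classes; empty images go last.
        return sum(counts[l] for l in labels) / len(labels) if labels else float("inf")

    while remaining and len(chosen) < budget:
        pick = min(remaining, key=rarity)
        remaining.remove(pick)
        chosen.append(pick[0])
        counts.update(pick[1])  # treat the pick as labeled for later rounds
    return chosen


def select(pool, labeled_counts, n_candidates, budget):
    """pool maps image_id -> (boxes_view_a, boxes_view_b, predicted_points)."""
    scored = []
    for image_id, (view_a, view_b, points) in pool.items():
        valid_a = box_level(point_level(view_a, points))
        valid_b = box_level(point_level(view_b, points))
        labels = [b.label for b in valid_a]
        scored.append((image_level(valid_a, valid_b), image_id, labels))
    scored.sort(key=lambda t: t[0])  # least consistent images first
    candidates = [(image_id, labels) for _, image_id, labels in scored[:n_candidates]]
    return class_level(candidates, labeled_counts, budget)
```

In this sketch, `select` first narrows the unlabeled pool to the `n_candidates` least consistent images and then spends the annotation `budget` on those whose classes are under-represented among the labeled samples. A real system would replace each placeholder rule with the corresponding module from the paper.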

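Source journal

International Journal of Computer Vision (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 29.80
Self-citation rate: 2.10%
Articles published: 163
Review time: 6 months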

Journal description: The International Journal of Computer Vision (IJCV) publishes new research findings in the rapidly growing field of computer vision. It appears in 12 issues per year and presents high-quality, original contributions to the science and engineering of computer vision. The journal accepts several article types. Regular articles, up to 25 journal pages, report significant technical advances of broad interest to the field. Short articles, limited to 10 pages, offer a faster publication path for novel results. Survey articles, up to 30 pages, critically evaluate the current state of the art or give tutorial presentations of relevant topics. The journal also publishes book reviews, position papers, and editorials by prominent scientific figures, which complement the technical content. Authors are encouraged to provide supplementary material online, such as images, video sequences, data sets, and software, to improve the understanding and reproducibility of published research.