Investigations of Object Detectors with Deep Learning Methods: A Review

Aiqing Yang, Mingzhe Liu, Jianping Wang, Qiulin He, Nan Zhang, Jiaru Jia
DOI: 10.1109/PRAI55851.2022.9904222 (https://doi.org/10.1109/PRAI55851.2022.9904222)
Published in: 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI)
Publication date: 2022-08-19
Citations: 1

Abstract

Over the past ten years, deep learning methods have achieved great progress in computer vision (CV), especially in object detection. Compared with traditional detection methods, deep learning methods achieve significantly better real-time performance and accuracy, without any complicated hand-crafted feature extraction. The powerful feature-extraction ability of convolutional neural networks (CNNs) became apparent when they enabled breakthroughs in image classification. Many researchers have therefore applied CNNs, which can learn high-level semantic features, to object detection, producing representative models such as the classical one-stage and two-stage detectors. In recent years, the Transformer, which has shown extraordinary capability in natural language processing (NLP), has also been used in object detection models. A novel kind of detector based on the transformer encoder-decoder architecture has been proposed that uses neither anchor generation nor non-maximum suppression (NMS) postprocessing; it introduces a new detection mode, set prediction, and achieves strong performance. In this paper, we summarize various detectors across these detection modes. We first review the architectures of classical two-stage and one-stage detectors. We then introduce transformer-based detectors in detail. An experimental evaluation compares the performance of the various detectors. Finally, we present a conclusion and future prospects to serve as a guideline for future work.
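For readers unfamiliar with the NMS postprocessing step that the abstract says transformer-based detectors avoid, the following is a minimal sketch of greedy non-maximum suppression in plain Python. It is not taken from the paper: the `(x1, y1, x2, y2)` corner format, the function names `iou` and `nms`, and the default `iou_threshold` of 0.5 are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Assumed box format (not from the paper): (x1, y1, x2, y2) corners.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit candidates in descending score order.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # [0, 2]
```

In the usage example, the second box overlaps the first with an IoU of about 0.68, so it is suppressed at the assumed threshold of 0.5. Set-prediction detectors aim to emit one prediction per object directly, which is why they can omit this step.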