Improving object detection accuracy with region and regression based deep CNNs

2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC) | Pub Date: 2017-12-01 | DOI: 10.1109/SPAC.2017.8304297

Liang Qu, Shengke Wang, Na Yang, Long Chen, Lu Liu, Xiaoyan Zhang, Feng Gao, Junyu Dong


Citations: 2

Abstract


Improving object detection accuracy with region and regression based deep CNNs

Object detection has improved greatly with convolutional neural networks (CNNs), high-capacity visual models that yield hierarchies of discriminative features. CNN-based object detection generally falls into two streams: region-based detection and regression-based detection. In this paper, we aim to further advance object detection performance by properly utilizing the complementary results of these two streams. By investigating the errors of several previous state-of-the-art methods from both streams, we find that their detection results are complementary in object recognition and localization. Region-based methods achieve high recall but struggle with localization, while regression-based methods make fewer localization errors by iteratively regressing the box toward the target location. Driven by these observations, we propose two fusion paradigms to combine the results of the two streams. The first is direct fusion, which pools the complementary results of the two streams and applies non-maximum suppression (NMS) and a voting operation to make full use of them. Because direct fusion may compromise the original performance of the detectors, we also propose a second method: a modified voting operation that refines only the box coordinates, without any other impact on the original detections, and that further boosts performance through an adding operation. Extensive experiments show that both ensemble paradigms improve on the state-of-the-art results on the Pascal VOC dataset.
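The two fusion paradigms can be illustrated with a minimal Python sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the `(x1, y1, x2, y2, score)` tuple layout, the IoU threshold of 0.5, the score-weighted form of box voting, and the function names `direct_fusion` and `refine_only`. The "adding operation" of the second paradigm is not specified in the abstract and is not modeled.

```python
from typing import List, Tuple

# A detection is (x1, y1, x2, y2, score). This layout is an assumption.
Det = Tuple[float, float, float, float, float]


def iou(a: Det, b: Det) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(dets: List[Det], thresh: float = 0.5) -> List[Det]:
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps."""
    kept: List[Det] = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) <= thresh for k in kept):
            kept.append(d)
    return kept


def vote(box: Det, pool: List[Det], thresh: float = 0.5) -> Det:
    """Replace the coordinates of `box` with the score-weighted average of
    all pool boxes whose IoU with it is at least `thresh`; keep its score."""
    voters = [p for p in pool if iou(box, p) >= thresh]
    total = sum(p[4] for p in voters)
    if not voters or total <= 0:
        return box
    x1, y1, x2, y2 = (sum(p[i] * p[4] for p in voters) / total for i in range(4))
    return (x1, y1, x2, y2, box[4])


def direct_fusion(region: List[Det], regression: List[Det],
                  thresh: float = 0.5) -> List[Det]:
    """Paradigm 1 (sketch): pool both streams, run NMS, then vote."""
    pool = list(region) + list(regression)
    return [vote(k, pool, thresh) for k in nms(pool, thresh)]


def refine_only(primary: List[Det], other: List[Det],
                thresh: float = 0.5) -> List[Det]:
    """Paradigm 2 (sketch): keep every primary detection and its score;
    only the coordinates are refined using overlapping boxes from `other`."""
    return [vote(d, [d] + list(other), thresh) for d in primary]


if __name__ == "__main__":
    region = [(10, 10, 50, 50, 0.9), (100, 100, 140, 140, 0.6)]
    regression = [(12, 12, 52, 52, 0.8)]
    print(direct_fusion(region, regression))
    print(refine_only(region, regression))
```

In this sketch, `direct_fusion` can change which detections survive, because NMS runs over the pooled set, whereas `refine_only` returns exactly one box per original detection with its score unchanged. That mirrors the stated motivation for the second paradigm: avoiding any impact on the original detections beyond the box coordinates.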
