Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross B. Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
{"title":"Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation","authors":"Ross B. Girshick, Jeff Donahue, Trevor Darrell, J. Malik","doi":"10.1109/CVPR.2014.81","DOIUrl":null,"url":null,"abstract":"Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22687","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2014.81","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22687

Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
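
The mAP figures above follow the PASCAL VOC evaluation protocol: an average precision (AP) is computed per class from detections ranked by confidence, and mAP is the mean over classes. The sketch below is not the paper's evaluation code; it is a minimal Python illustration using all-points interpolation, with hypothetical inputs (per-class confidence scores, a flag marking whether each detection matched a ground-truth box, and the ground-truth count) standing in for a detector's output. The official VOC toolkit differs in details (e.g., 11-point interpolation for VOC 2007).

import numpy as np

def average_precision(scores, matched, num_gt):
    """AP for one class: area under the precision-recall curve built by
    sweeping a threshold over detections sorted by confidence."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(matched, dtype=float)[order]        # 1 = matched a ground-truth box
    recall = np.cumsum(tp) / max(num_gt, 1)
    precision = np.cumsum(tp) / np.arange(1, len(tp) + 1)

    # Pad, make precision monotonically non-increasing, integrate over recall.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    step = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[step + 1] - mrec[step]) * mpre[step + 1]))

def mean_average_precision(per_class):
    """mAP: AP averaged over classes; per_class maps class -> (scores, matched, num_gt)."""
    return sum(average_precision(*v) for v in per_class.values()) / len(per_class)

if __name__ == "__main__":
    # Toy example with two classes and hand-made detections (hypothetical data).
    detections = {
        "person":  ([0.9, 0.8, 0.3], [1, 0, 1], 2),
        "bicycle": ([0.7, 0.6],      [1, 1],    3),
    }
    print(f"mAP = {mean_average_precision(detections):.3f}")

A missed ground-truth object caps a class's recall below 1 and therefore lowers its AP, which is why the metric rewards both precise scoring and complete localization.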