Object detection by labeling superpixels

2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2015-06-07. DOI: 10.1109/CVPR.2015.7299146

Junjie Yan, Yinan Yu, Xiangyu Zhu, Zhen Lei, S. Li

Citations: 98

Abstract

Object detection is often performed in two sequential steps: object proposal generation followed by classification. This paper instead handles object detection in a superpixel-oriented manner rather than a proposal-oriented one. Specifically, it treats object detection as a multi-label superpixel labeling problem solved by minimizing an energy function. The energy uses a data cost term to capture appearance, a smooth cost term to encode spatial context, and a label cost term to favor compact detections. The data cost is learned through a convolutional neural network, and the parameters of the labeling model are learned through a structural SVM. Compared with methods based on proposal generation and classification, the proposed superpixel labeling method can naturally detect objects missed by the proposal generation step and can capture global image context to infer overlapping objects. The proposed method shows its advantage on Pascal VOC and ImageNet. Notably, it performs better than the ImageNet ILSVRC2014 winner GoogLeNet (45.0% vs. 43.9% mAP) with much shallower and fewer CNNs.
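
To make the three-term energy concrete, here is a minimal sketch of how such an energy could be evaluated for one candidate labeling. It is not the paper's formulation: the abstract does not give the functional forms, so the Potts smoothness penalty, the per-label constant cost, the treatment of label 0 as background, and all names (`labeling_energy`, `smooth_weight`, `label_cost`) are assumptions chosen for illustration. In the paper the data costs come from a CNN and the weights from a structural SVM; here they are plain inputs.

```python
import numpy as np

def labeling_energy(labels, data_cost, edges, smooth_weight=1.0, label_cost=0.5):
    """Energy of one superpixel labeling (illustrative form, not the paper's).

    labels:    length-N sequence, labels[i] is the label index of superpixel i
    data_cost: (N, K) array, data_cost[i, k] is the cost of giving label k to i
    edges:     iterable of (i, j) pairs of adjacent superpixels
    """
    labels = np.asarray(labels)
    # Data term: appearance cost of each superpixel under its assigned label.
    data = data_cost[np.arange(len(labels)), labels].sum()
    # Smoothness term (Potts model, an assumption): count neighbor disagreements.
    smooth = smooth_weight * sum(int(labels[i] != labels[j]) for i, j in edges)
    # Label term (assumed constant per label): charge each distinct
    # non-background label in use, which favors compact detections.
    used = set(labels.tolist()) - {0}
    return float(data + smooth + label_cost * len(used))

# Three superpixels in a chain, two labels (0 = background, 1 = object).
costs = np.array([[0.1, 0.9], [0.8, 0.2], [0.7, 0.3]])
energy = labeling_energy([0, 1, 1], costs, edges=[(0, 1), (1, 2)])
```

Detection would then amount to searching for the labeling with the lowest energy; this sketch only scores a given labeling and does not perform that minimization.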


