利用超像素改进目标检测器和提取区域

2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI:10.1109/CVPR.2013.477

Guang Shu, Afshin Dehghan, M. Shah

{"title":"利用超像素改进目标检测器和提取区域","authors":"Guang Shu, Afshin Dehghan, M. Shah","doi":"10.1109/CVPR.2013.477","DOIUrl":null,"url":null,"abstract":"We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained objects detectors are usually degraded in unconstrained video environments due to variant illuminations, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video specific features like consistent color patterns. In our approach, we apply a Super pixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using super pixels, hence it can handle the problem of appearance variation. Most importantly, using Conditional Random Field (CRF) along with our super pixel-based BoW model, we develop and algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach and significantly improves the state-of-art detector by 5-16% in average precision.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"45 1","pages":"3721-3727"},"PeriodicalIF":0.0000,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"87","resultStr":"{\"title\":\"Improving an Object Detector and Extracting Regions Using Superpixels\",\"authors\":\"Guang Shu, Afshin Dehghan, M. Shah\",\"doi\":\"10.1109/CVPR.2013.477\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained objects detectors are usually degraded in unconstrained video environments due to variant illuminations, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video specific features like consistent color patterns. In our approach, we apply a Super pixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using super pixels, hence it can handle the problem of appearance variation. Most importantly, using Conditional Random Field (CRF) along with our super pixel-based BoW model, we develop and algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach and significantly improves the state-of-art detector by 5-16% in average precision.\",\"PeriodicalId\":6343,\"journal\":{\"name\":\"2013 IEEE Conference on Computer Vision and Pattern Recognition\",\"volume\":\"45 1\",\"pages\":\"3721-3727\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-06-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"87\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE Conference on Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR.2013.477\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2013.477","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 87

摘要

我们提出了一种方法来提高通用检测器在应用于特定视频时的检测性能。在无约束的视频环境中，由于不同的光照、背景和摄像机视点，离线训练的目标检测器的性能通常会下降。此外，大多数目标检测器都是使用haar特征或梯度特征训练的，但忽略了视频特定的特征，如一致的颜色模式。在我们的方法中，我们应用基于超级像素的词袋(BoW)模型来迭代地改进通用检测器的输出。与其他相关工作相比，我们的方法使用超像素构建视频专用检测器，因此可以处理外观变化问题。最重要的是，我们利用条件随机场(CRF)和基于超像素的BoW模型，开发了一种从背景中分割物体的算法。因此，我们的方法生成的输出是精确的目标区域，而不是大多数检测器生成的边界框。通常，我们的方法以通用检测器的检测边界框为输入，生成具有较高平均精度和精确目标区域的检测输出。在四个最新数据集上的实验证明了我们的方法的有效性，并显着提高了目前最先进的检测器的平均精度5-16%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Improving an Object Detector and Extracting Regions Using Superpixels

We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained objects detectors are usually degraded in unconstrained video environments due to variant illuminations, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video specific features like consistent color patterns. In our approach, we apply a Super pixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using super pixels, hence it can handle the problem of appearance variation. Most importantly, using Conditional Random Field (CRF) along with our super pixel-based BoW model, we develop and algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach and significantly improves the state-of-art detector by 5-16% in average precision.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE Conference on Computer Vision and Pattern Recognition

自引率

0.00%

发文量