Localized region context and object feature fusion for people head detection

2016 IEEE International Conference on Image Processing (ICIP). Pub Date: 2016-09-01. DOI: 10.1109/ICIP.2016.7532426

Yule Li, Y. Dou, Xinwang Liu, Teng Li

Pages: 594-598

Citations: 11

Abstract



People head detection in crowded scenes is challenging due to the large variability in clothing and appearance, the small scale of people, and strong partial occlusions. Traditional bottom-up proposal methods and existing region proposal network approaches suffer from either poor recall or low precision. In this paper, we propose to improve both the recall and precision of head detection with region proposal models by integrating local head information. Specifically, we first use a region proposal network to predict the bounding boxes and corresponding scores of multiple instances in a region. A local head classifier network is then trained to score the bounding boxes generated by the region proposal model. After that, we propose an adaptive fusion method that optimally combines the region and local scores to obtain the final score of each candidate bounding box. Furthermore, our fusion models can automatically learn the optimal hyper-parameters from data. Our algorithm achieves superior people head detection performance on the crowded scenes data set, significantly outperforming several recent state-of-the-art baselines in the literature.
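The abstract does not give the fusion formula, so the following is only a minimal sketch of the general idea of combining a region-level score with a local head score using weights learned from data. It assumes a logistic combination fitted by plain gradient descent; the functions `fuse` and `fit_fusion`, the parameter names, and the toy scores are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse(region_score, local_score, params):
    """Combine a region-level score and a local head score into one final score.

    Assumption: a logistic combination; the paper's actual fusion rule may differ.
    """
    w_region, w_local, bias = params
    return sigmoid(w_region * region_score + w_local * local_score + bias)


def fit_fusion(samples, labels, lr=0.5, epochs=2000):
    """Learn the fusion weights from data by gradient descent on the logistic loss.

    samples: list of (region_score, local_score) pairs, one per candidate box.
    labels:  1 if the box is a true head, 0 otherwise.
    """
    w_region, w_local, bias = 0.0, 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        g_region = g_local = g_bias = 0.0
        for (r, l), y in zip(samples, labels):
            err = fuse(r, l, (w_region, w_local, bias)) - y
            g_region += err * r
            g_local += err * l
            g_bias += err
        w_region -= lr * g_region / n
        w_local -= lr * g_local / n
        bias -= lr * g_bias / n
    return w_region, w_local, bias


# Made-up scores for six candidate boxes: (region score, local head score).
samples = [(0.9, 0.8), (0.7, 0.9), (0.8, 0.2), (0.3, 0.1), (0.6, 0.7), (0.4, 0.3)]
labels = [1, 1, 0, 0, 1, 0]

params = fit_fusion(samples, labels)
fused = [fuse(r, l, params) for r, l in samples]
for (r, l), y, s in zip(samples, labels, fused):
    print(f"region={r:.1f} local={l:.1f} label={y} fused={s:.3f}")
```

In this toy data the third box has a high region score but a low local score, which is the kind of case where a second, local classifier can help suppress a false positive from the proposal stage.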


Source journal

2016 IEEE International Conference on Image Processing (ICIP)

Self-citation rate

0.00%

Number of publications