Saliency Map Extraction in Human Crowd RGB Data

2019 58th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE) Pub Date: 2019-09-01 DOI: 10.23919/SICE.2019.8859898

Minh Tri Nguyen, Prarinya Siritanawan, K. Kotani

Citations: 1

Abstract

A saliency map of a crowded human scene predicts the regions that attract human visual attention. Humans can analyze the context of a visual scene and focus their attention on the salient regions of a crowd. In this work, we propose a novel convolutional neural network-based method for saliency prediction. Unlike classical work on crowd scenes, which relies on hand-crafted face features, our model extracts deep features using convolutional layers taken from an image classification model and learns the global context using convolutional layers with large receptive fields. A self-attention mechanism is applied to detect dependencies between elements of the feature maps. The model outperformed state-of-the-art methods on the Eyecrowd dataset for saliency in human crowds.
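
To illustrate how self-attention can relate elements of a feature map, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the attention variant, so this sketch assumes scaled dot-product attention in which queries, keys, and values are the raw feature vectors themselves (a trained model would normally use learned projections). The names `self_attention` and `feature_map` and the toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Apply scaled dot-product self-attention to a flattened feature map.

    `features` is a list of N position vectors, each of length C
    (an H x W x C feature map flattened to N = H * W positions).
    Assumption: queries, keys, and values are the features themselves;
    learned projections are omitted to keep the sketch short.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    attended, weights = [], []
    for query in features:
        # Similarity of this position to every position in the map.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in features
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output vector is a weighted average of all position vectors.
        attended.append([
            sum(w * value[c] for w, value in zip(row, features))
            for c in range(dim)
        ])
    return attended, weights


# Hypothetical 2 x 2 feature map with 2 channels, flattened to 4 positions.
feature_map = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
attended, weights = self_attention(feature_map)
```

Each row of `weights` sums to 1, and positions with similar feature vectors receive larger weights, which is one way the dependency between distant elements of a feature map can be captured.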
