Saliency Map Extraction in Human Crowd RGB Data

Minh Tri Nguyen, Prarinya Siritanawan, K. Kotani
Published in: 2019 58th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE)
Publication date: 2019-09-01
DOI: https://doi.org/10.23919/SICE.2019.8859898
Citation count: 1

Abstract

A saliency map for a crowded human scene predicts the regions that attract human visual attention. Humans are able to analyze the context of a visual scene and focus their attention on the salient regions of a crowd. In this work, we propose a novel convolutional neural network based method for saliency prediction. Unlike classical work on crowd scenes, which relies on hand-crafted face features, our model extracts deep features using the convolutional layers of an image classification model and learns the global context using convolutional layers with large receptive fields. A self-attention mechanism is applied to capture the dependencies between elements of the feature maps. The model outperformed state-of-the-art methods on the EyeCrowd dataset for saliency in human crowds.
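The abstract does not specify how the self-attention step is built, so the following is only a minimal sketch of the general idea: every spatial position of a feature map attends to every other position, and positions with similar features receive larger weights. It assumes scaled dot-product attention with a residual connection and uses the raw features as queries, keys, and values; a trained model would normally apply learned projections first. The function names `softmax` and `self_attention` and the toy feature map are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feature_map):
    """Scaled dot-product self-attention over an H x W x C feature map.

    feature_map is a nested list indexed as [row][col][channel].
    Returns (output, attention): output has the same shape as the input,
    and attention is an (H*W) x (H*W) matrix of weights.
    Assumption: queries, keys and values are the raw features themselves
    (no learned projections), which keeps the sketch dependency-free.
    """
    h, w = len(feature_map), len(feature_map[0])
    c = len(feature_map[0][0])
    # Flatten the spatial grid into a sequence of N = H*W feature vectors.
    tokens = [feature_map[i][j] for i in range(h) for j in range(w)]
    scale = math.sqrt(c)

    attention, mixed = [], []
    for query in tokens:
        # Similarity of this position to every position, scaled by sqrt(C).
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in tokens]
        weights = softmax(scores)
        attention.append(weights)
        # Weighted sum of all feature vectors, channel by channel.
        mixed.append([sum(wt * value[ch] for wt, value in zip(weights, tokens))
                      for ch in range(c)])

    # Residual connection, then restore the H x W layout.
    out_tokens = [[x + y for x, y in zip(tok, mix)]
                  for tok, mix in zip(tokens, mixed)]
    output = [[out_tokens[i * w + j] for j in range(w)] for i in range(h)]
    return output, attention


# Toy 2 x 2 feature map with 2 channels (made-up values).
fmap = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.5, 0.5]]]
out, attn = self_attention(fmap)
```

In this toy input, the top-left and bottom-left positions carry identical features, so the top-left position assigns them equal attention weight and gives less weight to the dissimilar top-right position. That is the kind of long-range dependency between feature-map elements that the abstract refers to.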