Entity centric Feature Pooling for Complex Event Detection

Ishani Chakraborty, Hui Cheng, O. Javed
{"title":"以实体为中心的复杂事件检测特征池","authors":"Ishani Chakraborty, Hui Cheng, O. Javed","doi":"10.1145/2660505.2660506","DOIUrl":null,"url":null,"abstract":"In this paper, we propose an entity centric region of interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many YouTube-like videos involve people interacting with each other and objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person based on human position and gaze based attention allocated per region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region and finally combine them into a single image signature. To this end, we show that our proposed semantically pooled image signature contains discriminative information that detects visual events favorably as compared to state of the art approaches.","PeriodicalId":434817,"journal":{"name":"HuEvent '14","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Entity centric Feature Pooling for Complex Event Detection\",\"authors\":\"Ishani Chakraborty, Hui Cheng, O. Javed\",\"doi\":\"10.1145/2660505.2660506\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose an entity centric region of interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many YouTube-like videos involve people interacting with each other and objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person based on human position and gaze based attention allocated per region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region and finally combine them into a single image signature. 
To this end, we show that our proposed semantically pooled image signature contains discriminative information that detects visual events favorably as compared to state of the art approaches.\",\"PeriodicalId\":434817,\"journal\":{\"name\":\"HuEvent '14\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"HuEvent '14\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2660505.2660506\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"HuEvent '14","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2660505.2660506","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

In this paper, we propose an entity-centric region-of-interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many such videos involve people interacting with each other and with objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image-based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person from the person's position and the gaze-based attention allocated to each region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region, and finally combine them into a single image signature. We show that the proposed semantically pooled image signature contains discriminative information and detects visual events favorably compared to state-of-the-art approaches.
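The pooling scheme the abstract describes can be illustrated with a rough sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the Gaussian proximity and gaze weighting used as a stand-in for the actionable space, the single AoI threshold that splits the image into two regions, and max-pooling as the pooling operator are all placeholders; the paper's actual saliency cues, actionable-space model, region partition, and pooling operator are not specified here.

```python
import numpy as np

def actionable_space_weight(h, w, person_box, gaze_dir, sigma=0.25):
    """Hypothetical actionable-space weighting: emphasis falls off with
    distance from the person (x1, y1, x2, y2 box) and is boosted along the
    gaze direction (assumed to be a unit 2D vector in image coordinates)."""
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cx = (person_box[0] + person_box[2]) / 2.0
    cy = (person_box[1] + person_box[3]) / 2.0
    dx, dy = (xs - cx) / w, (ys - cy) / h
    dist2 = dx ** 2 + dy ** 2
    proximity = np.exp(-dist2 / (2 * sigma ** 2))
    # Cosine between each pixel's offset and the gaze direction, clipped to [0, 1].
    norm = np.sqrt(dist2) + 1e-6
    gaze_align = np.clip((dx * gaze_dir[0] + dy * gaze_dir[1]) / norm, 0.0, 1.0)
    return proximity * (0.5 + 0.5 * gaze_align)

def aoi_pooled_signature(saliency, person_box, gaze_dir, features, positions,
                         thresh=0.5):
    """Pool local descriptors separately inside and outside the AoI and
    concatenate the two pooled vectors into one image signature.

    saliency:  (h, w) saliency map in [0, 1]
    features:  (N, D) local descriptors
    positions: list of (x, y) pixel locations of the descriptors
    """
    h, w = saliency.shape
    aoi = saliency * actionable_space_weight(h, w, person_box, gaze_dir)
    aoi = aoi / (aoi.max() + 1e-6)
    in_aoi = np.array([aoi[y, x] >= thresh for x, y in positions], dtype=bool)
    pooled = []
    for mask in (in_aoi, ~in_aoi):                    # two disparate regions
        region_feats = features[mask]
        if len(region_feats) == 0:
            pooled.append(np.zeros(features.shape[1]))
        else:
            pooled.append(region_feats.max(axis=0))   # max-pooling per region
    return np.concatenate(pooled)
```

In a pipeline of this kind, a signature produced per keyframe would then feed an event classifier; the saliency map, person detection, gaze estimate, and local descriptors would come from whatever upstream detectors and features the system already computes.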