Entity centric Feature Pooling for Complex Event Detection

Ishani Chakraborty, Hui Cheng, O. Javed
{"title":"以实体为中心的复杂事件检测特征池","authors":"Ishani Chakraborty, Hui Cheng, O. Javed","doi":"10.1145/2660505.2660506","DOIUrl":null,"url":null,"abstract":"In this paper, we propose an entity centric region of interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many YouTube-like videos involve people interacting with each other and objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person based on human position and gaze based attention allocated per region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region and finally combine them into a single image signature. To this end, we show that our proposed semantically pooled image signature contains discriminative information that detects visual events favorably as compared to state of the art approaches.","PeriodicalId":434817,"journal":{"name":"HuEvent '14","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Entity centric Feature Pooling for Complex Event Detection\",\"authors\":\"Ishani Chakraborty, Hui Cheng, O. Javed\",\"doi\":\"10.1145/2660505.2660506\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose an entity centric region of interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many YouTube-like videos involve people interacting with each other and objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person based on human position and gaze based attention allocated per region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region and finally combine them into a single image signature. 
To this end, we show that our proposed semantically pooled image signature contains discriminative information that detects visual events favorably as compared to state of the art approaches.\",\"PeriodicalId\":434817,\"journal\":{\"name\":\"HuEvent '14\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"HuEvent '14\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2660505.2660506\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"HuEvent '14","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2660505.2660506","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

In this paper, we propose an entity-centric region-of-interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos. Our method is based on the hypothesis that many such videos involve people interacting with each other and with objects in their vicinity. Based on this hypothesis, we first discover an Area of Interest (AoI) map in image keyframes and then use the AoI map for localized pooling of features. The AoI map is derived from image-based saliency cues weighted by the actionable space of the person involved in the event. We extract the actionable space of the person from the person's position and the gaze-based attention allocated to each region. Based on the AoI map, we divide the image into disparate regions, pool features separately from each region, and finally combine them into a single image signature. We show that the proposed semantically pooled image signature contains discriminative information and detects visual events favorably compared to state-of-the-art approaches.
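The pooling scheme the abstract describes can be illustrated with a rough sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the Gaussian proximity and gaze weighting used as a stand-in for the actionable space, the single AoI threshold that splits the image into two regions, and max-pooling as the pooling operator are all placeholders; the paper's actual saliency cues, actionable-space model, region partition, and pooling operator are not specified here.

```python
import numpy as np

def actionable_space_weight(h, w, person_box, gaze_dir, sigma=0.25):
    """Hypothetical actionable-space weighting: emphasis falls off with
    distance from the person (x1, y1, x2, y2 box) and is boosted along the
    gaze direction (assumed to be a unit 2D vector in image coordinates)."""
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cx = (person_box[0] + person_box[2]) / 2.0
    cy = (person_box[1] + person_box[3]) / 2.0
    dx, dy = (xs - cx) / w, (ys - cy) / h
    dist2 = dx ** 2 + dy ** 2
    proximity = np.exp(-dist2 / (2 * sigma ** 2))
    # Cosine between each pixel's offset and the gaze direction, clipped to [0, 1].
    norm = np.sqrt(dist2) + 1e-6
    gaze_align = np.clip((dx * gaze_dir[0] + dy * gaze_dir[1]) / norm, 0.0, 1.0)
    return proximity * (0.5 + 0.5 * gaze_align)

def aoi_pooled_signature(saliency, person_box, gaze_dir, features, positions,
                         thresh=0.5):
    """Pool local descriptors separately inside and outside the AoI and
    concatenate the two pooled vectors into one image signature.

    saliency:  (h, w) saliency map in [0, 1]
    features:  (N, D) local descriptors
    positions: list of (x, y) pixel locations of the descriptors
    """
    h, w = saliency.shape
    aoi = saliency * actionable_space_weight(h, w, person_box, gaze_dir)
    aoi = aoi / (aoi.max() + 1e-6)
    in_aoi = np.array([aoi[y, x] >= thresh for x, y in positions], dtype=bool)
    pooled = []
    for mask in (in_aoi, ~in_aoi):                    # two disparate regions
        region_feats = features[mask]
        if len(region_feats) == 0:
            pooled.append(np.zeros(features.shape[1]))
        else:
            pooled.append(region_feats.max(axis=0))   # max-pooling per region
    return np.concatenate(pooled)
```

In a pipeline of this kind, a signature produced per keyframe would then feed an event classifier; the saliency map, person detection, gaze estimate, and local descriptors would come from whatever upstream detectors and features the system already computes.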