Content-based human actions retrieval by a novel low complex action representation
Mohsen Ramezani, F. Yaghmaee
2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), published 2014-12-22
DOI: 10.1109/ICCKE.2014.6993466 (https://doi.org/10.1109/ICCKE.2014.6993466)
Citations: 2
Abstract
The rapid growth of multimedia data (e.g., videos) on the web poses challenges for conventional search methods. Content-Based Video Retrieval (CBVR) was introduced in response and has attracted considerable research interest as a way to manage the search of videos collected on the Internet. Furthermore, because most of these videos involve humans, human action retrieval has emerged as a new topic in CBVR. In this paper, we seek to improve the accuracy of state-of-the-art CBVR algorithms at minor computational cost. In the proposed method, local feature points are extracted from each video, and the moving directions and scales of the action it contains are calculated from the points' gradients. Each point's gradients along the different axes are concatenated into a vector that represents the point. Each video's vectors are then grouped into four clusters, whose centers are taken as the main directions and scales of the action. The dissimilarity of two videos is calculated with a novel fuzzy distance measure between their cluster centers. Experimental results on the widely used UCF YouTube dataset, which has 11 action categories, show that our method outperforms the Bag-of-Words model at a lower computational cost.
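The pipeline described in the abstract (per-point gradient vectors, four cluster centers per video, a distance between the two sets of centers) can be illustrated with a minimal sketch. Several parts are assumptions, not the authors' implementation: the descriptor is taken to be a three-component gradient (x, y, t), the clustering is plain k-means, and `fuzzy_like_dissimilarity` is a hypothetical stand-in that weights center-to-center distances with inverse-square memberships, since the abstract does not give the formula of the paper's fuzzy distance measure. The input data is synthetic.

```python
import math
import random


def kmeans(vectors, k=4, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length vectors; returns k centers."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: math.dist(v, centers[j]))
            groups[nearest].append(v)
        for j, members in enumerate(groups):
            if members:  # keep the previous center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def _soft_match(center, other_centers, eps=1e-12):
    """Distance from one center to a set of centers, weighted by
    inverse-square memberships (an assumed stand-in, not the paper's measure)."""
    dists = [math.dist(center, other) for other in other_centers]
    if min(dists) < eps:
        return 0.0  # an exact match takes full membership
    weights = [1.0 / (d * d) for d in dists]
    return sum(w * d for w, d in zip(weights, dists)) / sum(weights)


def fuzzy_like_dissimilarity(centers_a, centers_b):
    """Symmetric dissimilarity between two videos' cluster centers."""
    a_to_b = sum(_soft_match(a, centers_b) for a in centers_a) / len(centers_a)
    b_to_a = sum(_soft_match(b, centers_a) for b in centers_b) / len(centers_b)
    return 0.5 * (a_to_b + b_to_a)


def synthetic_video(offset, n=40, seed=1):
    """Hypothetical per-point gradient vectors (gx, gy, gt) for one video."""
    rng = random.Random(seed)
    return [[offset + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
            for _ in range(n)]


video_a = synthetic_video(offset=0.0)
video_b = synthetic_video(offset=5.0)
centers_a = kmeans(video_a)  # four centers summarize video A
centers_b = kmeans(video_b)  # four centers summarize video B
print(fuzzy_like_dissimilarity(centers_a, centers_b))
```

Under these assumptions, comparing two videos only requires distances between two sets of four centers, which is consistent with the low computational cost the abstract claims relative to a Bag-of-Words representation.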