Action Categorization in Soccer Videos Using String Kernels

2009 Seventh International Workshop on Content-Based Multimedia Indexing. Pub Date: 2009-06-03. DOI: 10.1109/CBMI.2009.10

Lamberto Ballan, M. Bertini, A. Bimbo, G. Serra


Citations: 14

Abstract


Action Categorization in Soccer Videos Using String Kernels

Action recognition is a crucial task for providing a high-level semantic description of video content, particularly in the case of sports videos. The bag-of-words (BoW) approach has proven successful for categorizing objects and scenes in images, but it cannot model temporal information between consecutive frames for video event recognition. In this paper, we present an approach that models an action as a sequence of histograms (one per frame), each represented using a traditional bag-of-words model. An action is thus described by a string (phrase) of variable size, depending on the clip's length, in which each frame's representation is treated as a character. To compare these strings we use the Needleman-Wunsch distance, a metric defined in information theory that handles strings of different lengths. Finally, SVMs with a string kernel that includes this distance are used to perform classification. Experimental results demonstrate the validity of the proposed approach and show that it outperforms baseline kNN classifiers.
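The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the substitution cost, the gap penalty, or the exact form of the kernel, so everything specific here is an assumption made for illustration: the chi-square distance between frame histograms, a constant gap cost `gap`, the exponential kernel `exp(-gamma * d)`, and the function names `chi_square`, `nw_distance`, and `string_kernel`. It is not the authors' implementation.

```python
import math
from typing import Sequence

Histogram = Sequence[float]  # one L1-normalised bag-of-words histogram per frame


def chi_square(h1: Histogram, h2: Histogram) -> float:
    """Chi-square distance between two histograms (assumed substitution cost).

    For L1-normalised histograms the value lies in [0, 1].
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def nw_distance(s: Sequence[Histogram], t: Sequence[Histogram],
                gap: float = 1.0) -> float:
    """Needleman-Wunsch style global alignment cost between two clips.

    Each clip is a "string" whose characters are frame histograms.
    `gap` is a hypothetical constant cost for skipping a frame.
    """
    n, m = len(s), len(t)
    # prev[j] = cost of aligning the first i-1 frames of s with the first j of t
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j - 1] + chi_square(s[i - 1], t[j - 1]),  # align two frames
                prev[j] + gap,      # skip a frame of s
                curr[j - 1] + gap,  # skip a frame of t
            )
        prev = curr
    return prev[m]


def string_kernel(s: Sequence[Histogram], t: Sequence[Histogram],
                  gamma: float = 0.5, gap: float = 1.0) -> float:
    """Assumed kernel form: similarity decays exponentially with the distance."""
    return math.exp(-gamma * nw_distance(s, t, gap))
```

Because the alignment allows gaps, clips with different numbers of frames can be compared directly. A Gram matrix built from `string_kernel` over all training clips could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Note that a kernel of this form is not guaranteed to be positive semi-definite, so this should be checked on real data.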
