Automatic sports video genre categorization for broadcast videos
Yuan Dong, Jiwei Zhang, Xiaofu Chang, Jian Zhao
2012 Visual Communications and Image Processing, 2012-11-01
DOI: 10.1109/VCIP.2012.6410850 (https://doi.org/10.1109/VCIP.2012.6410850)
This paper presents a novel sports genre categorization algorithm based on representative shot extraction and geometry visual phrases (GVP). Sports classification performance can be noticeably improved by generating a reduced image set that contains representative information and by encoding spatial information into the bag-of-words (BOW) model. First, shots containing significant information are selected from each video by key-frame clustering. Second, GVPs are identified from the co-occurrence of visual words in a spatial layout, based on scale-invariant feature transform (SIFT) features. The visual-word and GVP histograms are then concatenated to form enhanced histograms, which are passed to an SVM-based classification procedure. Compared with most existing methods, our algorithm requires no domain knowledge and is fully automatic, and thus offers better extensibility. Experiments on a database of 10 sport genres with more than 10,257 minutes of video from different sources achieved an average accuracy of 87.3%, which validates the robustness of the proposed algorithm on a large-scale database.
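The idea of concatenating a visual-word histogram with a spatial co-occurrence histogram can be illustrated with a minimal sketch. It is not the authors' implementation: the pairwise phrase definition, the offset quantization (`cell`, `n_bins`), the L1 normalization, and all function names are assumptions made for illustration, and the input is assumed to be keypoints already quantized to visual word ids (for example, from SIFT descriptors).

```python
from collections import Counter
from itertools import combinations


def bow_histogram(words, vocab_size):
    """Count how often each visual word id occurs in one frame."""
    hist = [0] * vocab_size
    for w in words:
        hist[w] += 1
    return hist


def pair_phrase_histogram(words, points, vocab_size, cell=32, n_bins=3):
    """Count pairs of visual words that co-occur at a quantized spatial offset.

    Assumption: a "phrase" here is simply a pair of words plus the offset
    between them, quantized to a grid of n_bins x n_bins cells of `cell` pixels.
    """
    half = n_bins // 2
    counts = Counter()
    for i, j in combinations(range(len(words)), 2):
        # Order each pair by word id so (a, b) and (b, a) give the same phrase.
        (wa, pa), (wb, pb) = sorted([(words[i], points[i]), (words[j], points[j])])
        dx = round((pb[0] - pa[0]) / cell)
        dy = round((pb[1] - pa[1]) / cell)
        # Keep only pairs whose offset falls inside the local grid.
        if abs(dx) <= half and abs(dy) <= half:
            counts[(wa, wb, dx + half, dy + half)] += 1
    # Flatten (word_a, word_b, offset_x, offset_y) into a fixed-length vector.
    hist = [0] * (vocab_size * vocab_size * n_bins * n_bins)
    for (wa, wb, bx, by), c in counts.items():
        hist[((wa * vocab_size + wb) * n_bins + bx) * n_bins + by] = c
    return hist


def l1_normalize(hist):
    """Scale a histogram to sum to 1 (left unchanged if it is all zeros)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def enhanced_histogram(words, points, vocab_size):
    """Concatenate the normalized word histogram and phrase histogram."""
    bow = bow_histogram(words, vocab_size)
    phrases = pair_phrase_histogram(words, points, vocab_size)
    return l1_normalize(bow) + l1_normalize(phrases)


# Hypothetical frame: three keypoints, a two-word vocabulary.
words = [0, 1, 1]
points = [(10, 10), (40, 12), (200, 200)]
feature = enhanced_histogram(words, points, vocab_size=2)
print(len(feature))
```

In a full pipeline, one such vector per representative frame would be the input to the SVM. The pairwise table grows quadratically with the vocabulary size, so a realistic vocabulary would need a sparse representation rather than the dense list used here.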