Bin Zhang, Wei Chen, W. Dou, Yujin Zhang, Liming Chen
Fourth International Conference on Image and Graphics (ICIG 2007), published 2007-08-22. DOI: 10.1109/ICIG.2007.78 (https://doi.org/10.1109/ICIG.2007.78)
Content-based Table Tennis Games Highlight Detection Utilizing Audiovisual Clues
Audio and video are both important information carriers of multimedia content. In this paper, we propose an algorithm that uses audiovisual clues for sports game highlight detection, with table tennis games as the studied scenario. Because audio and video carry different aspects of the information that helps locate highlights, we build two algorithms that detect highlight candidates from audio and from video, respectively, applying hidden Markov model (HMM) audio keyword modeling to the former and unsupervised shot clustering to the latter. Decision fusion then combines the audio and video highlight candidates to generate the final highlights. Experimental results are promising, with an average precision of up to 90%.
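The decision fusion step can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so everything below is an assumption made for illustration: the `Candidate` type, the `fuse` function, the weighted-sum rule, and the `w_audio` and `threshold` parameters are hypothetical and are not taken from the paper. The sketch treats each video candidate as a shot, looks for an overlapping audio candidate, and keeps the shot if the combined score is high enough.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A highlight candidate: a time span in seconds and a confidence in [0, 1]."""
    start: float
    end: float
    score: float


def overlap(a: Candidate, b: Candidate) -> float:
    """Length in seconds of the intersection of two time spans (0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def fuse(audio: List[Candidate], video: List[Candidate],
         w_audio: float = 0.5, threshold: float = 0.5) -> List[Candidate]:
    """Hypothetical late fusion: weighted sum of audio and video scores.

    For each video candidate, take the best score among the audio
    candidates that overlap it in time (0 if none overlap), combine the
    two scores, and keep the span if the result reaches the threshold.
    """
    fused = []
    for v in video:
        a_score = max((a.score for a in audio if overlap(a, v) > 0), default=0.0)
        score = w_audio * a_score + (1.0 - w_audio) * v.score
        if score >= threshold:
            fused.append(Candidate(v.start, v.end, score))
    return fused


# Made-up inputs: one audio keyword hit and two video shot candidates.
audio_candidates = [Candidate(10.0, 14.0, 0.9)]
video_candidates = [Candidate(8.0, 15.0, 0.8), Candidate(30.0, 40.0, 0.6)]
highlights = fuse(audio_candidates, video_candidates)
```

With these made-up inputs, only the first video candidate is retained, because it is the only one supported by an overlapping audio candidate. A weighted sum is just one possible rule; a voting or rule-based scheme would fit the same interface.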