Highlight sound effects detection in audio stream
Rui Cai, Lie Lu, HongJiang Zhang, Lianhong Cai
2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), published 2003-07-06
DOI: 10.1109/ICME.2003.1221242 (https://doi.org/10.1109/ICME.2003.1221242)

This paper addresses the problem of detecting highlight sound effects in an audio stream, which is useful for video summarization and highlight extraction. Unlike research on audio segmentation and classification, which labels the entire stream, this task only locates the highlight sound effects within it. An extensible framework is proposed; the current system considers three sound effects: laughter, applause, and cheering, which are associated with highlight events in entertainment, sports, meetings, and home videos. HMMs are used to model these sound effects, and a method based on log-likelihood scores makes the final decision. A sound effect attention model is also proposed to extend the general audio attention model for highlight extraction and video summarization. Evaluations on a 2-hour audio database showed very encouraging results.
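
The abstract does not give the features, HMM topology, or exact decision rule, so the following is only a minimal sketch of the general idea of scoring a segment against one HMM per sound effect and deciding from the log-likelihoods. Everything concrete here is an assumption made for illustration: the discrete observation symbols (a real system would more likely use continuous acoustic features), the toy two-state parameters in `MODELS`, the per-frame normalization, the `threshold` value, and the function names `hmm_log_likelihood` and `detect`.

```python
import math


def safe_log(p):
    """log(p) that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, start, trans, emit):
    """Forward algorithm in the log domain for a discrete-observation HMM.

    obs is a list of symbol indices; start[i], trans[i][j], and emit[i][k]
    are ordinary probabilities.
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return logsumexp(alpha)


def detect(obs, models, threshold):
    """Score obs under every model and return (label, scores).

    The label is the best-scoring sound effect if its per-frame
    log-likelihood reaches the threshold (an assumed rule), else "none".
    """
    scores = {
        name: hmm_log_likelihood(obs, *params) / len(obs)
        for name, params in models.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else "none"), scores


# Hypothetical toy models: (start, transition, emission) per sound effect.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    "laughter": (START, TRANS, [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]),
    "applause": (START, TRANS, [[0.1, 0.8, 0.1], [0.1, 0.6, 0.3]]),
    "cheer": (START, TRANS, [[0.1, 0.1, 0.8], [0.3, 0.1, 0.6]]),
}

if __name__ == "__main__":
    for segment in ([0, 0, 0, 0], [0, 1, 2, 0, 1, 2]):
        label, scores = detect(segment, MODELS, threshold=-1.0)
        print(segment, label, scores)
```

Rejecting segments whose best score falls below the threshold is what lets such a detector locate only the sound effects of interest instead of forcing every segment into one of the three classes, and adding a new sound effect amounts to adding one more entry to `MODELS`.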