Movie audio scene recognition based on WFST
Jichen Yang, Min Cai, Yanxiong Li, Hai Jin
2016 International Conference on Audio, Language and Image Processing (ICALIP), 2016-07-01
DOI: 10.1109/ICALIP.2016.7846543

To improve movie audio scene (MAS) recognition accuracy, this paper proposes using a weighted finite-state transducer (WFST) to recognize MAS. The paper first introduces the WFST, then describes how to construct it, and finally applies it to MAS recognition with three features used separately: FBANK, MFCC, and PLPCC. Experimental results on twenty MASs show that the WFST recognizes MAS well and that the FBANK feature outperforms MFCC and PLPCC, reaching a recognition accuracy of 79.9%.