Authors: Shan He, Jiawen Wu, Lin Li
Published in: 2015 IEEE 9th International Conference on Anti-counterfeiting, Security, and Identification (ASID)
Publication date: 2015-09-01
DOI: https://doi.org/10.1109/ICASID.2015.7405671
Speech enhancement using convolutive non-negative matrix factorization with multiple dictionaries
We introduce an effective framework for speech enhancement using convolutive nonnegative matrix factorization (CNMF) that requires no a priori knowledge of the clean speech or the noise type. Assuming that the speech signal is unrelated to the noise signal, CNMF attenuates the background noise in noisy speech using a mixed dictionary that consists of speech bases and noise bases. To be practical, in the training phase a universal speech dictionary is trained on clean speech utterances unrelated to the noisy speech, and a noise dictionary is prepared in advance from different kinds of noise signals. Then, in the enhancement process, only the speech bases are updated on the noisy utterances using the Kullback-Leibler divergence (KLD). The experiments were conducted on the NOIZEUS corpus with speech corrupted by babble, pink, and white noise over input SNRs from 0 dB to 15 dB. The results show that the proposed framework achieves better perceptual evaluation of speech quality (PESQ) scores and higher output SNR than conventional methods.
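The enhancement step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a nonnegative magnitude spectrogram `V` (frequency bins by frames), pre-trained dictionaries `W_speech` and `W_noise` of shape (T, F, R), and the standard multiplicative updates for CNMF under the KL divergence. The function names (`shift`, `reconstruct`, `enhance`) are hypothetical, and the final soft mask is a common choice that the abstract does not mention.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def shift(X, t):
    """Shift the columns of X by t frames (right if t > 0, left if t < 0), zero-filled."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    if t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out


def reconstruct(W, H):
    """CNMF model: sum over t of W[t] @ shift(H, t), with W (T, F, R) and H (R, N)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def kl_divergence(V, L):
    """Generalized Kullback-Leibler divergence between V and its approximation L."""
    return float(np.sum(V * np.log((V + EPS) / (L + EPS)) - V + L))


def enhance(V, W_speech, W_noise, n_iter=50, seed=0):
    """Fit a mixed dictionary to V; noise bases stay fixed, speech bases adapt.

    Assumption: the speech estimate is obtained with a soft mask built from
    the speech and noise reconstructions.
    """
    rng = np.random.default_rng(seed)
    T, _, Rs = W_speech.shape
    W = np.concatenate([W_speech, W_noise], axis=2)  # mixed dictionary (a copy)
    H = rng.random((W.shape[2], V.shape[1])) + EPS    # random nonnegative activations
    ones = np.ones_like(V)
    history = []

    for _ in range(n_iter):
        # Update all activations (speech and noise rows).
        ratio = V / (reconstruct(W, H) + EPS)
        num = sum(W[t].T @ shift(ratio, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T))
        H *= num / (den + EPS)

        # Update only the speech columns of each W[t]; noise columns are untouched.
        ratio = V / (reconstruct(W, H) + EPS)
        for t in range(T):
            Ht = shift(H, t)
            num = ratio @ Ht.T
            den = ones @ Ht.T
            W[t, :, :Rs] *= (num / (den + EPS))[:, :Rs]

        history.append(kl_divergence(V, reconstruct(W, H)))

    speech = reconstruct(W[:, :, :Rs], H[:Rs])
    noise = reconstruct(W[:, :, Rs:], H[Rs:])
    mask = speech / (speech + noise + EPS)
    return mask * V, W, H, history
```

In this sketch the enhanced magnitude returned by `enhance` would be combined with the phase of the noisy signal to resynthesize a waveform, which is another assumption rather than a detail given in the abstract. The `history` list records the KL divergence after each iteration and can be used to check convergence.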