Speech enhancement using convolutive non-negative matrix factorization with multiple dictionaries

2015 IEEE 9th International Conference on Anti-counterfeiting, Security, and Identification (ASID). Pub Date: 2015-09-01. DOI: 10.1109/ICASID.2015.7405671

Shan He, Jiawen Wu, Lin Li


Citations: 0

Abstract


We introduce a framework for speech enhancement using convolutive nonnegative matrix factorization (CNMF) that requires no a priori knowledge of the clean speech or the noise type. Assuming that the speech signal is uncorrelated with the noise signal, CNMF attenuates the background noise in noisy speech using a mixed dictionary that consists of speech bases and noise bases. To keep the method practical, in the training phase a universal speech dictionary is trained on clean speech utterances unrelated to the noisy speech, and a noise dictionary is prepared in advance from different kinds of noise signals. In the enhancement phase, only the speech bases are updated on the noisy utterances, using the Kullback-Leibler divergence (KLD). The experiments were carried out on the NOIZEUS corpus with speech corrupted by babble, pink, and white noise at input SNRs from 0 dB to 15 dB. The results show that the proposed framework achieves better perceptual evaluation of speech quality (PESQ) scores and output SNR than conventional methods.
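The enhancement step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the standard multiplicative KL-divergence updates for convolutive NMF, pre-trained dictionaries stored as arrays of shape (T, F, K), a magnitude spectrogram V of shape (F, N), and a Wiener-style soft mask for the final speech estimate (the abstract does not say how the speech is reconstructed). The names enhance, shift, and reconstruct are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero

def shift(X, t):
    """Shift the columns of X right by t frames (left if t < 0), zero-filled."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    if t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def reconstruct(W, H):
    """CNMF model: sum over lags t of W[t] @ shift(H, t)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0])) + EPS

def enhance(V, W_speech, W_noise, n_iter=50, seed=0):
    """Fit a mixed speech+noise dictionary to the noisy spectrogram V.

    Assumed behaviour: activations H are updated for all bases, the
    speech bases are adapted to the noisy utterance, and the noise
    bases stay fixed. Returns (speech estimate, adapted W, H).
    """
    rng = np.random.default_rng(seed)
    W = np.concatenate([W_speech, W_noise], axis=2)  # (T, F, Ks + Kn), a copy
    T, _, K = W.shape
    Ks = W_speech.shape[2]
    H = rng.random((K, V.shape[1])) + EPS
    ones = np.ones_like(V)

    for _ in range(n_iter):
        # Activation update: one KL multiplicative step per lag, then averaged.
        R = V / reconstruct(W, H)
        H_acc = np.zeros_like(H)
        for t in range(T):
            H_acc += H * (W[t].T @ shift(R, -t)) / (W[t].T @ ones + EPS)
        H = H_acc / T

        # Basis update: only the first Ks (speech) columns are modified.
        R = V / reconstruct(W, H)
        for t in range(T):
            Ht = shift(H, t)
            ratio = (R @ Ht.T) / (ones @ Ht.T + EPS)
            W[t, :, :Ks] *= ratio[:, :Ks]

    # Soft mask: speech share of the modelled energy, applied to V.
    speech = reconstruct(W[:, :, :Ks], H[:Ks])
    total = reconstruct(W, H)
    return V * speech / total, W, H
```

In use, V would be the magnitude spectrogram of a noisy utterance, and the returned estimate would be combined with the noisy phase before inverting the transform. Averaging the activation update over lags is the usual heuristic for CNMF and does not by itself guarantee a monotonic decrease of the divergence.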
