A Late Reverberation Power Spectral Density Aware Approach to Speech Dereverberation Based on Deep Neural Networks
Yuanlei Qi, Feiran Yang, Jun Yang
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), November 2019
DOI: 10.1109/APSIPAASC47483.2019.9023202 (https://doi.org/10.1109/APSIPAASC47483.2019.9023202)
In recent years, a variety of speech dereverberation algorithms based on deep neural networks (DNNs) have been proposed. These algorithms usually adopt anechoic speech as their target output. Consequently, speech distortion may occur, which impairs speech intelligibility. In fact, early reflections reinforce the direct-path sound and therefore have a positive impact on speech intelligibility. In traditional speech dereverberation methods, early reflections are generally retained together with the direct-path sound. Based on these observations, we propose to adopt both the direct-path sound and the early reflections as the target DNN output. Moreover, we propose a late reverberation power spectral density (PSD) aware training strategy to further suppress the late reverberation. Experimental results demonstrate that the proposed DNN framework achieves significant improvement in objective measures, even under mismatched conditions.
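The two ideas in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a known room impulse response (RIR), a hypothetical 50 ms boundary between early and late reflections, and an oracle late-reverberation PSD computed from the late part of the RIR. The abstract does not say how the PSD is estimated or how it is fed to the network; appending it to the input features, in the manner of noise-aware training, is an assumption. All function names and parameters below are illustrative.

```python
import numpy as np

def split_rir(rir, fs, early_ms=50.0):
    """Split an RIR into direct-path-plus-early and late parts.

    early_ms is an assumed boundary, measured from the direct-path peak.
    """
    peak = int(np.argmax(np.abs(rir)))  # index of the direct-path arrival
    cut = min(len(rir), peak + int(round(early_ms * 1e-3 * fs)))
    early = np.zeros_like(rir)
    late = np.zeros_like(rir)
    early[:cut] = rir[:cut]
    late[cut:] = rir[cut:]
    return early, late

def make_training_signals(clean, rir, fs, early_ms=50.0):
    """Return the reverberant input, the training target, and the late reverberation."""
    early, late = split_rir(rir, fs, early_ms)
    n = len(clean)
    reverberant = np.convolve(clean, rir)[:n]   # network input signal
    target = np.convolve(clean, early)[:n]      # direct path + early reflections
    late_reverb = np.convolve(clean, late)[:n]  # component to be suppressed
    return reverberant, target, late_reverb

def power_spectrogram(x, n_fft=512, hop=256):
    """Frame-wise power spectrum using a Hann window (frames x frequency bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + max(0, len(x) - n_fft) // hop
    x = np.pad(x, (0, max(0, n_fft - len(x))))
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def psd_aware_input(reverberant, late_reverb, eps=1e-10):
    """Concatenate log-power features of the input with a log late-reverb PSD.

    Assumption: the PSD is appended as an auxiliary input feature.
    """
    log_rev = np.log(power_spectrogram(reverberant) + eps)
    log_late = np.log(power_spectrogram(late_reverb) + eps)
    return np.concatenate([log_rev, log_late], axis=1)
```

Because convolution is linear, the reverberant signal equals the target plus the late reverberation, so the network is trained to remove only the late part. At test time the RIR is unknown, so the oracle PSD used here would have to be replaced by an estimate.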