Experimental study on speech enhancement using DNN with perceptual weighting
Authors: Wenhua Shi, Xiongwei Zhang, Xia Zou, Meng Sun
Venue: International Conference on Critical Infrastructure Protection
Publication date: 2018-11-02
DOI: 10.1145/3290420.3290465 (https://doi.org/10.1145/3290420.3290465)
Citation count: 0
Abstract
The auditory system cannot easily distinguish quantization noise in high-energy regions of the spectrum. Based on this phenomenon, this paper presents an experimental study on speech enhancement using a deep neural network (DNN) with perceptual weighting. The auditory-weighted error criterion, which is widely used in low-bit-rate vocoders, is applied by filtering the error spectrum. The filter is shaped like the inverse of the original signal's spectrum, so errors in high-energy regions are penalized less than errors in low-energy regions. The DNN learns the nonlinear mapping from noisy speech to clean speech by minimizing the weighted error spectrum between the estimated speech and the target speech. Experiments are conducted on the TIMIT database, with noise types that are unmatched between the training and test stages. The results show that the proposed method outperforms the baseline methods in terms of perceptual evaluation of speech quality (PESQ) and log-spectral distortion (LSD) for most noise types.
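The weighted error criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact filter, so the sketch assumes per-bin weights equal to the inverse of the clean power spectrum, normalized to a mean of 1 in each frame. The function names and the `exponent` and `floor` parameters are hypothetical.

```python
import numpy as np

def perceptual_weights(clean_power, exponent=1.0, floor=1e-8):
    """Inverse-spectrum weights, normalised to mean 1 within each frame.

    clean_power: array of shape (frames, bins) with clean-speech power spectra.
    exponent and floor are hypothetical knobs, not values from the paper.
    """
    inv = 1.0 / np.maximum(clean_power, floor) ** exponent
    return inv / inv.mean(axis=-1, keepdims=True)

def weighted_spectral_error(estimate, target, weights):
    """Mean squared error over all bins after weighting each bin."""
    err = estimate - target
    return float(np.mean(weights * err ** 2))

# One frame with a high-energy bin (4.0) and a low-energy bin (1.0).
clean_power = np.array([[4.0, 1.0]])
weights = perceptual_weights(clean_power)      # [[0.4, 1.6]]

target = np.zeros((1, 2))
# The same unit error placed in the high-energy bin, then in the low-energy bin.
loss_high = weighted_spectral_error(np.array([[1.0, 0.0]]), target, weights)
loss_low = weighted_spectral_error(np.array([[0.0, 1.0]]), target, weights)
print(loss_high, loss_low)
```

Under these assumptions, a unit error in the high-energy bin costs less than the same error in the low-energy bin, which matches the masking argument: errors are tolerated where the speech spectrum is strong. In a training loop, a value of this kind would replace the plain mean squared error as the objective.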