Advanced speech enhancement with partial speech reconstruction
P. Hannon, M. Krini, Ingo Schalk-Schupp
21st European Signal Processing Conference (EUSIPCO 2013), 2013-09-09
DOI: 10.5281/ZENODO.43592 (https://doi.org/10.5281/ZENODO.43592)
Citation count: 2
Abstract
An advanced speech enhancement algorithm is proposed that employs partial speech reconstruction for highly disturbed speech. The speech reconstruction algorithms assume the source-filter model of speech production and construct estimates of the clean speech source and filter signals from features extracted from the noisy input. A nonlinear harmonic regeneration scheme for the source signal is presented, followed by two methods for estimating the vocal tract filter characteristics. The quantization method applies codebooks trained a priori on clean speech data, and the parametric estimation method assumes a parabolic continuation of low-frequency envelope values. The predicted speech quality of the enhanced output is assessed with composite objective measures, while the accuracy of the spectral envelope estimates is analyzed with the log-spectral distance over four manually generated signal-to-noise ratio scenarios.
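The log-spectral distance mentioned above can be illustrated with a minimal sketch. It assumes the common textbook definition (root-mean-square difference of the log power spectra in dB over frequency, averaged over frames); the abstract does not state which variant, frequency range, or normalization the authors use, so the function name `log_spectral_distance`, the `eps` guard, and the array layout are illustrative assumptions only.

```python
import numpy as np


def log_spectral_distance(ref_env, est_env, eps=1e-12):
    """Mean per-frame log-spectral distance in dB (assumed textbook form).

    ref_env, est_env: arrays of shape (frames, bins) or (bins,) holding
    non-negative power-spectral envelope values.
    """
    ref = np.asarray(ref_env, dtype=float)
    est = np.asarray(est_env, dtype=float)
    if ref.shape != est.shape:
        raise ValueError("envelopes must have the same shape")
    # Difference of the log power spectra in dB; eps guards against log(0).
    diff_db = 10.0 * np.log10((ref + eps) / (est + eps))
    # RMS over frequency bins for each frame, then average over frames.
    per_frame = np.sqrt(np.mean(diff_db ** 2, axis=-1))
    return float(np.mean(per_frame))


if __name__ == "__main__":
    reference = np.ones((3, 4))       # hypothetical clean-speech envelope
    estimate = 10.0 * reference       # estimate that is 10 dB too high
    print(log_spectral_distance(reference, estimate))
```

An estimate that matches the reference envelope exactly yields a distance of zero, and a uniform level offset yields the magnitude of that offset in dB, which makes the measure easy to sanity-check before applying it to real envelopes.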