Title: Energy contour enhancement for noisy speech recognition
Authors: Tai-Hwei Hwang, Sen-Chia Chang
Published in: 2004 International Symposium on Chinese Spoken Language Processing, 2004-12-15
DOI: https://doi.org/10.1109/CHINSL.2004.1409633
Citations: 9
Abstract
Environmental noise, modeled as additive noise, not only corrupts the spectrum of a speech signal but also blurs the shape of its energy contour. This corruption can distort energy-derived features and degrade classification performance on noisy speech. To reduce the distortion of the energy feature, the energy bias that the noise adds to the contour has to be removed before feature extraction. For this purpose, we propose two methods to estimate the noise energy: one estimates it from speech-inactive periods, and the other from the noisy speech itself. The methods are evaluated on connected-digit recognition with TIDigits, in which the test speech is corrupted with white noise, babble, factory noise, and in-car noise. The experiments show that energy enhancement provides an additional improvement when applied jointly with spectral subtraction.
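The bias-removal idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the estimators themselves, so everything below is an assumption made for illustration: the function names, the frame sizes, the choice of leading frames as the speech-inactive period, the use of the lowest-energy frames as a stand-in for estimating noise "from the noisy speech itself", and the flooring constant. It is not the authors' implementation.

```python
import math


def frame_energies(samples, frame_len=200, hop=80):
    """Per-frame energy (sum of squared samples) on a linear scale."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def noise_energy_from_silence(energies, n_lead=10):
    """Mean energy of the first n_lead frames.

    Assumption: the recording starts with a speech-inactive period.
    """
    lead = energies[:n_lead]
    return sum(lead) / len(lead)


def noise_energy_from_signal(energies, fraction=0.1):
    """Mean of the lowest-energy fraction of frames.

    Assumption: a simple stand-in for estimating noise from the noisy
    speech itself; the paper's actual estimator is not described here.
    """
    k = max(1, int(len(energies) * fraction))
    lowest = sorted(energies)[:k]
    return sum(lowest) / k


def remove_energy_bias(energies, noise_energy, floor_ratio=0.01):
    """Subtract the noise energy from each frame and floor the result.

    The floor keeps every value positive so a log can be taken afterwards.
    """
    floor = max(floor_ratio * noise_energy, 1e-10)
    return [max(e - noise_energy, floor) for e in energies]


def log_energy(energies):
    """Log-energy contour, the usual form of the energy feature."""
    return [math.log(e) for e in energies]


# Toy contour: 10 noise-only frames, 3 speech frames, 5 noise-only frames.
contour = [1.0] * 10 + [5.0, 9.0, 5.0] + [1.0] * 5
noise = noise_energy_from_silence(contour)
enhanced = remove_energy_bias(contour, noise)
```

In this toy contour the noise adds a constant offset of 1.0 to every frame. Subtracting it restores the contrast between speech and non-speech frames, which is what the log-energy feature and its derivatives depend on.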