{"title":"基于小功率增强算法的鲁棒语音识别","authors":"Chanwoo Kim, Kshitiz Kumar, R. Stern","doi":"10.1109/ASRU.2009.5373230","DOIUrl":null,"url":null,"abstract":"In this paper, we present a noise robustness algorithm called Small Power Boosting (SPB). We observe that in the spectral domain, time-frequency bins with smaller power are more affected by additive noise. The conventional way of handling this problem is estimating the noise from the test utterance and doing normalization or subtraction. In our work, in contrast, we intentionally boost the power of time-frequency bins with small energy for both the training and testing datasets. Since time-frequency bins with small power no longer exist after this power boosting, the spectral distortion between the clean and corrupt test sets becomes reduced. This type of small power boosting is also highly related to physiological nonlinearity. We observe that when small power boosting is done, suitable weighting smoothing becomes highly important. Our experimental results indicate that this simple idea is very helpful for very difficult noisy environments such as corruption by background music.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"Robust speech recognition using a Small Power Boosting algorithm\",\"authors\":\"Chanwoo Kim, Kshitiz Kumar, R. Stern\",\"doi\":\"10.1109/ASRU.2009.5373230\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we present a noise robustness algorithm called Small Power Boosting (SPB). We observe that in the spectral domain, time-frequency bins with smaller power are more affected by additive noise. The conventional way of handling this problem is estimating the noise from the test utterance and doing normalization or subtraction. In our work, in contrast, we intentionally boost the power of time-frequency bins with small energy for both the training and testing datasets. Since time-frequency bins with small power no longer exist after this power boosting, the spectral distortion between the clean and corrupt test sets becomes reduced. This type of small power boosting is also highly related to physiological nonlinearity. We observe that when small power boosting is done, suitable weighting smoothing becomes highly important. Our experimental results indicate that this simple idea is very helpful for very difficult noisy environments such as corruption by background music.\",\"PeriodicalId\":292194,\"journal\":{\"name\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2009.5373230\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2009.5373230","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust speech recognition using a Small Power Boosting algorithm
In this paper, we present a noise robustness algorithm called Small Power Boosting (SPB). We observe that in the spectral domain, time-frequency bins with smaller power are more affected by additive noise. The conventional way of handling this problem is estimating the noise from the test utterance and doing normalization or subtraction. In our work, in contrast, we intentionally boost the power of time-frequency bins with small energy for both the training and testing datasets. Since time-frequency bins with small power no longer exist after this power boosting, the spectral distortion between the clean and corrupt test sets becomes reduced. This type of small power boosting is also highly related to physiological nonlinearity. We observe that when small power boosting is done, suitable weighting smoothing becomes highly important. Our experimental results indicate that this simple idea is very helpful for very difficult noisy environments such as corruption by background music.