{"title":"Robust speech recognition for similar pronunciation phrases using MMSE under noise environments","authors":"Masumi Watanabe, Hiroshi Tsutsui, Y. Miyanaga","doi":"10.1109/ISCIT.2013.6645971","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a robust speech recognition method for similar pronunciation phrases. Along with the popularization of information devices such as personal computers and smart-phones, many applications controlled by voice have spread in the society. In order to increase the speech accuracy under a real environment, it is extremely important to discriminate similar pronunciation phrases. In the proposed method, linear prediction theory (LPC) is used for spectral analysis while cepstrum mean subtraction (CMS) and dynamic range adjustment (DRA) is used for a noise reduction method. The speech accuracy was recorded 68.7 % in SNR 10 dB by using the proposed methods. In conclusion, LPC+CMS/DRA is the most effective method to discriminate similar pronunciation phrases.","PeriodicalId":356009,"journal":{"name":"2013 13th International Symposium on Communications and Information Technologies (ISCIT)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 13th International Symposium on Communications and Information Technologies (ISCIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCIT.2013.6645971","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
In this paper, we propose a robust speech recognition method for similar pronunciation phrases. Along with the popularization of information devices such as personal computers and smart-phones, many applications controlled by voice have spread in the society. In order to increase the speech accuracy under a real environment, it is extremely important to discriminate similar pronunciation phrases. In the proposed method, linear prediction theory (LPC) is used for spectral analysis while cepstrum mean subtraction (CMS) and dynamic range adjustment (DRA) is used for a noise reduction method. The speech accuracy was recorded 68.7 % in SNR 10 dB by using the proposed methods. In conclusion, LPC+CMS/DRA is the most effective method to discriminate similar pronunciation phrases.