研究语音特征及其相应分布特征在鲁棒语音识别中的应用

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI:10.1109/ASRU.2007.4430089

Shih-Hsiang Lin, Yao-ming Yeh, Berlin Chen

{"title":"研究语音特征及其相应分布特征在鲁棒语音识别中的应用","authors":"Shih-Hsiang Lin, Yao-ming Yeh, Berlin Chen","doi":"10.1109/ASRU.2007.4430089","DOIUrl":null,"url":null,"abstract":"The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few of techniques have been proposed to improve ASR robustness over the last few decades. Related work reported in the literature can be generally divided into two aspects according to whether the orientation of the methods is either from the feature domain or from the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions to compensate the noise effects. Two variants of the proposed approach are also extensively investigated as well. All experiments are conducted on the Aurora-2 database and task. Experimental results show that for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system, and also significantly outperform other conventional methods.","PeriodicalId":371729,"journal":{"name":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition\",\"authors\":\"Shih-Hsiang Lin, Yao-ming Yeh, Berlin Chen\",\"doi\":\"10.1109/ASRU.2007.4430089\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few of techniques have been proposed to improve ASR robustness over the last few decades. Related work reported in the literature can be generally divided into two aspects according to whether the orientation of the methods is either from the feature domain or from the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions to compensate the noise effects. Two variants of the proposed approach are also extensively investigated as well. All experiments are conducted on the Aurora-2 database and task. Experimental results show that for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system, and also significantly outperform other conventional methods.\",\"PeriodicalId\":371729,\"journal\":{\"name\":\"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2007.4430089\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2007.4430089","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

当输入语音受到各种噪声源的干扰时，当前自动语音识别系统的性能往往会急剧下降。在过去的几十年里，已经提出了相当多的技术来提高ASR的鲁棒性。根据方法的方向是从特征域还是从相应的概率分布，文献中报道的相关工作大致可以分为两个方面。本文提出了一种多项式回归方法，该方法可以直接表征语音特征与其对应的概率分布之间的关系，以补偿噪声的影响。所提出的方法的两个变体也被广泛研究。所有实验均在Aurora-2数据库和任务上进行。实验结果表明，对于清洁条件训练，我们的方法在基线系统上实现了相当大的单词错误率降低，并且显著优于其他传统方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition

The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few of techniques have been proposed to improve ASR robustness over the last few decades. Related work reported in the literature can be generally divided into two aspects according to whether the orientation of the methods is either from the feature domain or from the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions to compensate the noise effects. Two variants of the proposed approach are also extensively investigated as well. All experiments are conducted on the Aurora-2 database and task. Experimental results show that for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system, and also significantly outperform other conventional methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)

自引率

0.00%

发文量