噪声语音识别中mel频率倒谱系数的补偿

Australasian Computer Science Conference Pub Date : 1900-01-01 DOI:10.1145/1151699.1151705

E. Choi

{"title":"噪声语音识别中mel频率倒谱系数的补偿","authors":"E. Choi","doi":"10.1145/1151699.1151705","DOIUrl":null,"url":null,"abstract":"This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that employs a combination of Mel-filterbank output compensation and cumulative distribution mapping of cepstral coefficients with truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database reveal that the proposed front-end achieves an average digit recognition accuracy of 84.92% for a model set trained from clean speech data. Compared with the ETSI standard Mel-cepstral front-end, the proposed front-end is found to obtain a relative error rate reduction of around 61%. Moreover, the proposed front-end can provide comparable recognition accuracy with the ETSI advanced front-end, at less than half the computation load.","PeriodicalId":136130,"journal":{"name":"Australasian Computer Science Conference","volume":"194 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"On compensating the Mel-frequency cepstral coefficients for noisy speech recognition\",\"authors\":\"E. Choi\",\"doi\":\"10.1145/1151699.1151705\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that employs a combination of Mel-filterbank output compensation and cumulative distribution mapping of cepstral coefficients with truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database reveal that the proposed front-end achieves an average digit recognition accuracy of 84.92% for a model set trained from clean speech data. Compared with the ETSI standard Mel-cepstral front-end, the proposed front-end is found to obtain a relative error rate reduction of around 61%. Moreover, the proposed front-end can provide comparable recognition accuracy with the ETSI advanced front-end, at less than half the computation load.\",\"PeriodicalId\":136130,\"journal\":{\"name\":\"Australasian Computer Science Conference\",\"volume\":\"194 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Australasian Computer Science Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1151699.1151705\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australasian Computer Science Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1151699.1151705","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 18

摘要

本文提出了一种新的抗噪自动语音识别(ASR)前端，该前端采用mel滤波器组输出补偿和截断高斯分布的倒谱系数累积分布映射相结合的方法。在Aurora II连接数字数据库上的识别实验表明，对于干净语音数据训练的模型集，所提前端的平均数字识别准确率达到84.92%。与ETSI标准mel -倒谱前端相比，该前端的相对错误率降低了61%左右。此外，所提出的前端可以提供与ETSI高级前端相当的识别精度，而计算负荷不到ETSI高级前端的一半。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

On compensating the Mel-frequency cepstral coefficients for noisy speech recognition

This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that employs a combination of Mel-filterbank output compensation and cumulative distribution mapping of cepstral coefficients with truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database reveal that the proposed front-end achieves an average digit recognition accuracy of 84.92% for a model set trained from clean speech data. Compared with the ETSI standard Mel-cepstral front-end, the proposed front-end is found to obtain a relative error rate reduction of around 61%. Moreover, the proposed front-end can provide comparable recognition accuracy with the ETSI advanced front-end, at less than half the computation load.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Australasian Computer Science Conference

自引率

0.00%

发文量