Robust speech recognition by properly utilizing reliable frames and segments in corrupted signals

Yi Chen, C. Wan, Lin-Shan Lee
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), published 2007-12-01
DOI: 10.1109/ASRU.2007.4430091 (https://doi.org/10.1109/ASRU.2007.4430091)
Citation count: 1

Abstract

We propose a new approach to detecting and utilizing reliable frames and segments in corrupted signals for robust speech recognition. We develop novel methods for estimating an energy-based measure and a harmonicity measure for each frame. SNR-dependent GMM classifiers are then trained and used, together with a reliable frame selection and clustering module and a reliable segment identification module, to detect the most reliable frames in an utterance. The reliable frames and segments thus obtained can be used in both front-end feature enhancement and back-end Viterbi decoding. In extensive experiments, the proposed approaches yielded significant improvements in recognition accuracy for all noise types and all SNR values defined in the Aurora 2 database.
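The abstract does not give formulas for the two per-frame measures, so the following is only a rough sketch of the general idea under stated assumptions. It uses log frame energy as the energy-based measure and the peak of the normalized autocorrelation over a pitch-lag range as the harmonicity measure; both are common textbook choices, not necessarily the estimators developed in the paper. The final threshold rule is a deliberately simple stand-in for the SNR-dependent GMM classifiers, and all function names, frame sizes, and thresholds are hypothetical values chosen for 8 kHz audio.

```python
import numpy as np

def frame_signal(x, frame_len=200, hop=80):
    """Split a 1-D signal into overlapping frames.

    frame_len and hop are assumed values (25 ms / 10 ms at 8 kHz).
    """
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def log_energy(frames, eps=1e-10):
    """Energy-based measure (assumption): log frame energy in dB."""
    return 10.0 * np.log10(np.sum(frames ** 2, axis=1) + eps)

def harmonicity(frames, min_lag=20, max_lag=160, eps=1e-10):
    """Harmonicity measure (assumption): peak of the normalized
    autocorrelation within a plausible pitch-lag range."""
    scores = np.empty(len(frames))
    for k, f in enumerate(frames):
        f = f - f.mean()
        # Keep only non-negative lags; ac[0] is the frame energy.
        ac = np.correlate(f, f, mode="full")[len(f) - 1 :]
        scores[k] = np.max(ac[min_lag : max_lag + 1]) / (ac[0] + eps)
    return scores

def select_reliable(frames, energy_margin_db=6.0, harm_threshold=0.5):
    """Stand-in for the paper's classifiers: a frame is marked reliable
    when it is both clearly above the median energy and strongly periodic."""
    e = log_energy(frames)
    h = harmonicity(frames)
    return (e > np.median(e) + energy_margin_db) & (h > harm_threshold)
```

A boolean mask of this kind could then be passed to later stages, for example to weight frames during feature enhancement or decoding. How the paper actually uses the reliable frames in those stages is not described in the abstract.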