Split Acoustic Modeling in Decoder for Phoneme Recognition

R. Pradeep, K. S. Rao
DOI: 10.1109/INDICON.2017.8487556
Published in: 2017 14th IEEE India Council International Conference (INDICON), December 2017
Citations: 2

Abstract

Deep neural networks (DNNs) are now a central component of nearly all state-of-the-art speech recognition systems. Much recent research has concentrated on reducing the computational complexity of DNN training by developing different architectures. However, the search space of the decoder in automatic speech recognition (ASR) is huge, and the decoder is prone to substitution errors. In this work, we introduce a split decoding mechanism by creating separate sonorant and obstruent acoustic models. Speech frames detected as sonorants are fed only to the sonorant acoustic models, and frames detected as obstruents only to the obstruent acoustic models. This reduces the decoder search space in ASR and also minimises substitution errors. Sonorants, which broadly include the vowels, the semi-vowels, and the nasals, are detected by exploiting the spectral flatness measure (SFM) computed on the magnitude linear prediction (LP) spectrum. The proposed split decoding method based on sonority detection decreased the phone error rate by nearly 0.7% when evaluated on the core TIMIT test corpus, compared to the conventional decoding used with the state-of-the-art DNN.
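The sonorant/obstruent detection described above rests on a standard signal-processing idea: sonorants have a peaky (formant-dominated) LP spectrum and hence a low spectral flatness, while obstruents are noise-like and flatter. The sketch below illustrates this, assuming the autocorrelation (Levinson-type) LPC formulation; the LP order, FFT size, and the decision threshold are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lp_spectrum(frame, order=12, n_fft=512):
    """Magnitude LP spectrum of a windowed speech frame (autocorrelation LPC)."""
    frame = frame * np.hamming(len(frame))
    # one-sided autocorrelation sequence r[0..len-1]
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # solve the symmetric Toeplitz normal equations R a = r for the LP coefficients
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
    # LP envelope is 1/|A(e^{jw})| with A(z) = 1 - sum a_k z^{-k}
    a_poly = np.concatenate(([1.0], -a))
    return 1.0 / np.abs(np.fft.rfft(a_poly, n_fft))

def spectral_flatness(spectrum, eps=1e-10):
    """SFM: geometric mean over arithmetic mean of the magnitude spectrum (in [0, 1])."""
    s = spectrum + eps
    return np.exp(np.mean(np.log(s))) / np.mean(s)

def is_sonorant(frame, threshold=0.3):
    """Classify a frame as sonorant when its LP-spectrum flatness is low.

    The threshold here is hypothetical; in practice it would be tuned on
    labelled data such as TIMIT."""
    return spectral_flatness(lp_spectrum(frame)) < threshold
```

A voiced, harmonic-rich frame yields a strongly peaked LP envelope and hence a low SFM, whereas a white-noise-like (obstruent) frame yields an SFM closer to 1, which is what makes the measure usable as a routing decision in front of the two acoustic models.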