Investigating the Downstream Impact of Grapheme-Based Acoustic Modeling on Spoken Utterance Classification

Ryan Price, Bhargav Srinivas Ch, Surbhi Singhal, S. Bangalore
Published in: 2018 IEEE Spoken Language Technology Workshop (SLT)
Publication date: 2018-12-01
DOI: 10.1109/SLT.2018.8639549 (https://doi.org/10.1109/SLT.2018.8639549)
Citations: 1

Abstract

Automatic speech recognition (ASR) and natural language understanding are critical components of spoken language understanding (SLU) systems. One obstacle to providing services with SLU systems in multiple languages is the cost associated with acquiring all of the language-specific resources required for ASR in each language. Modeling graphemes eliminates the need to obtain a pronunciation dictionary which maps from speech sounds to words and is one way to reduce ASR resource dependencies when rapidly developing ASR in new languages. However, little is known about the downstream impact on SLU task performance when selecting graphemes as the acoustic modeling unit. This work investigates acoustic modeling for the ASR component of an SLU system using grapheme-based approaches together with convolutional and recurrent neural network architectures. We evaluate both ASR word accuracy and spoken utterance classification (SUC) accuracy for English, Italian and Spanish language tasks and find that it is possible to achieve SUC accuracy that is comparable to conventional phoneme-based systems which leverage a pronunciation dictionary.
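To make concrete why grapheme modeling removes the dependency on a pronunciation dictionary, the following is a minimal sketch of how a grapheme "lexicon" can be derived directly from the word list, with each word mapped to its sequence of letters. The function name `grapheme_lexicon`, the lowercasing, the Unicode NFC normalization, and the choice to drop non-alphabetic characters are all assumptions made for illustration; the abstract does not describe the grapheme inventory or text normalization actually used in the paper.

```python
import unicodedata


def grapheme_lexicon(words):
    """Map each word to the list of its letters (hypothetical helper).

    Assumptions for this sketch: words are lowercased, normalized to NFC so
    that accented letters are single code points, and non-alphabetic
    characters (digits, apostrophes, hyphens) are dropped.
    """
    lexicon = {}
    for word in words:
        normalized = unicodedata.normalize("NFC", word.lower())
        units = [ch for ch in normalized if ch.isalpha()]
        if units:  # skip entries that contain no letters at all
            lexicon[normalized] = units
    return lexicon


if __name__ == "__main__":
    # One English, one Italian, and one Spanish example word.
    sample = ["hello", "citt\u00e0", "se\u00f1or"]
    for word, units in grapheme_lexicon(sample).items():
        print(word, " ".join(units))
```

Because the mapping is computed from spelling alone, the same routine applies to English, Italian, and Spanish word lists without any language-specific phonetic resource, which is the resource saving the abstract refers to. A phoneme-based system would instead need a hand-built or separately trained mapping from each word to its phoneme sequence.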