Pronunciation modeling for speech technology

T. Svendsen
{"title":"Pronunciation modeling for speech technology","authors":"T. Svendsen","doi":"10.1109/SPCOM.2004.1458347","DOIUrl":null,"url":null,"abstract":"Written text is based on an orthographic representation of words, i.e. linear sequences of letters. Modern speech technology (automatic speech recognition and text-to-speech synthesis) is based on phonetic units representing realization of sounds. A mapping between the orthographic form and phonetic forms representing the pronunciation is thus required. This may be obtained by creating pronunciation lexica and/or rule-based systems for grapheme-to-phoneme conversion. Traditionally, this mapping has been obtained manually, based on phonetic and linguistic knowledge. This approach has a number of drawbacks: i) the pronunciations represent typical pronunciations and will have a limited capacity for describing pronunciation variation due to speaking style and dialectical/accent variations; ii) if multiple pronunciation variants are included, it does not indicate which variants are more significant for the specific application; iii) the description is based on phonetic-knowledge and does not take into account that the units used in speech technology may deviate from the phonetic interpretation; and iv) the description is limited to units with a linguistic interpretation. The paper will present and discuss methods for modeling pronunciation and pronunciation variation specifically for applications in speech technology.","PeriodicalId":424981,"journal":{"name":"2004 International Conference on Signal Processing and Communications, 2004. SPCOM '04.","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2004 International Conference on Signal Processing and Communications, 2004. SPCOM '04.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPCOM.2004.1458347","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

Written text is based on an orthographic representation of words, i.e. linear sequences of letters. Modern speech technology (automatic speech recognition and text-to-speech synthesis) is based on phonetic units representing realization of sounds. A mapping between the orthographic form and phonetic forms representing the pronunciation is thus required. This may be obtained by creating pronunciation lexica and/or rule-based systems for grapheme-to-phoneme conversion. Traditionally, this mapping has been obtained manually, based on phonetic and linguistic knowledge. This approach has a number of drawbacks: i) the pronunciations represent typical pronunciations and will have a limited capacity for describing pronunciation variation due to speaking style and dialectical/accent variations; ii) if multiple pronunciation variants are included, it does not indicate which variants are more significant for the specific application; iii) the description is based on phonetic-knowledge and does not take into account that the units used in speech technology may deviate from the phonetic interpretation; and iv) the description is limited to units with a linguistic interpretation. The paper will present and discuss methods for modeling pronunciation and pronunciation variation specifically for applications in speech technology.
语音技术的发音建模
书面文本是基于单词的正字法表示,即字母的线性序列。现代语音技术(自动语音识别和文本到语音的合成)是基于表示声音实现的语音单位。因此,表示发音的正字法形式和语音形式之间的映射是必需的。这可以通过创建发音词典和/或基于规则的系统来实现字素到音素的转换。传统上,这种映射是基于语音和语言知识手工获得的。这种方法有一些缺点:1)发音代表典型的发音,由于说话风格和辩证/口音的变化,描述发音变化的能力有限;Ii)如果包含多个发音变体,则没有说明哪些变体对具体应用更重要;Iii)描述是基于语音知识的,没有考虑到语音技术中使用的单位可能偏离语音解释;iv)描述仅限于具有语言解释的单位。本文将介绍和讨论语音建模和语音变化的方法,特别是在语音技术中的应用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信