Estimating vocal tract shapes of Thai vowels from contextual vowel variation

S. Prom-on, P. Birkholz, Yi Xu
Published in: 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA)
DOI: 10.1109/ICSDA.2014.7051442
Citations: 2

Abstract

This paper presents a computational estimation of vocal tract shape parameters as articulatory targets of Thai vowels in an articulatory synthesizer, using analysis-by-synthesis with acoustic data as input. A speech corpus of 81 disyllabic utterances, designed to capture the contextual variants of nine Thai long vowels, was recorded by a native Thai speaker. For each utterance, two targets, one per syllable, were estimated by iteratively optimizing the target parameters to minimize the mel-frequency cepstral coefficient (MFCC) error between the original and synthesized speech. The estimated targets of each vowel type were then averaged, yielding nine articulatory targets, one per vowel. These optimized targets were then used to synthesize Thai vowels both in monosyllables and in disyllabic sequences. The results indicate that the estimated targets effectively represent the underlying articulatory goals of Thai vowels.
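The estimate-then-average procedure described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: `toy_synthesize` is a hypothetical linear map standing in for a real (nonlinear) articulatory synthesizer plus MFCC extraction, the two-parameter targets and the labels `V1` to `V9` are placeholders, the corpus is assumed to be all 9 × 9 ordered vowel pairs, contextual variation is imitated with small random perturbations, and the optimizer is a generic derivative-free pattern search, since the abstract does not say which optimizer was used.

```python
import random
from collections import defaultdict

def toy_synthesize(params):
    """Hypothetical stand-in for synthesizer + MFCC extraction:
    maps two shape parameters to a three-value feature vector."""
    a, b = params
    return [a + 0.5 * b, b - 0.5 * a, 0.2 * (a + b)]

def feature_error(params, observed):
    """Squared error between synthesized and observed features."""
    synth = toy_synthesize(params)
    return sum((s - o) ** 2 for s, o in zip(synth, observed))

def estimate_target(observed, start=(0.0, 0.0), step=0.5, tol=1e-6):
    """Analysis-by-synthesis loop: perturb one parameter at a time,
    keep any change that lowers the error, halve the step otherwise."""
    best = list(start)
    best_err = feature_error(best, observed)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                err = feature_error(cand, observed)
                if err < best_err:
                    best, best_err, improved = cand, err, True
        if not improved:
            step /= 2
    return best

rng = random.Random(0)
vowels = [f"V{k}" for k in range(1, 10)]  # placeholder vowel labels
# Hypothetical "true" targets that the procedure should recover.
true_targets = {v: (rng.uniform(-1, 1), rng.uniform(-1, 1)) for v in vowels}

# Assumed corpus: every ordered pair of vowels, two targets per utterance.
estimates = defaultdict(list)
for first in vowels:
    for second in vowels:
        for vowel in (first, second):
            # Small perturbation imitates context-dependent realization.
            realized = [p + rng.gauss(0, 0.05) for p in true_targets[vowel]]
            observed = toy_synthesize(realized)
            estimates[vowel].append(estimate_target(observed))

# Average all per-syllable estimates of the same vowel.
averaged = {
    v: [sum(col) / len(col) for col in zip(*ests)]
    for v, ests in estimates.items()
}
```

In this toy setup each vowel receives 18 per-syllable estimates (nine as the first syllable and nine as the second), and averaging them smooths out the context-dependent perturbations. With a real synthesizer the error surface is not convex, so a practical optimizer would likely need multiple starting points or a more robust search strategy.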