Assessing phoneme distribution for speech modeling

J. A. Parra, C. Calvache, M. Zañartu
Symposium on Medical Information Processing and Analysis. Published 2023-03-06. DOI: 10.1117/12.2670042 (https://doi.org/10.1117/12.2670042)
Citations: 0

Abstract

Assessing phoneme distribution for speech modeling
Phonetically balanced texts are used to study different voice and speech characteristics. In the context of clinical work and research, these texts provide a standard for quantifying perceptual, acoustic, or aerodynamic assessments. Recent modeling efforts are being devoted to describing long-term speech behaviors based on a collection of sustained phonemes. However, comprehensive descriptions of phoneme distributions representative of connected speech are not readily available. Thus, the present study introduces a method to estimate phoneme distributions using text data mining, as an alternative to existing power law methods. The procedure used for the decomposition of texts into phonemes, the estimation of the phonetic distributions, and the comparisons between different texts, conversational speech, and standard reading passages are discussed. The results are presented using histograms and R-squared determination coefficients for the case of the English language, although the approach can be easily applied to other languages. A discussion of the proposed method, results, and limitations is presented.
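
The pipeline outlined in the abstract (decompose text into phonemes, estimate a relative-frequency distribution, compare two distributions with an R-squared coefficient) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the toy lexicon `TOY_LEXICON`, the function names, and the choice of R-squared as the squared Pearson correlation between two frequency vectors are all assumptions made for this example. A real study would presumably rely on a full pronouncing dictionary or a grapheme-to-phoneme tool.

```python
from collections import Counter
import re

# Hypothetical toy lexicon with ARPAbet-style symbols (an assumption for
# illustration; not taken from the paper).
TOY_LEXICON = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
    "on": ["AA", "N"],
    "a": ["AH"],
    "mat": ["M", "AE", "T"],
    "dog": ["D", "AO", "G"],
    "ran": ["R", "AE", "N"],
}

# Fixed phoneme inventory so that all distributions share the same axis.
INVENTORY = sorted({p for pron in TOY_LEXICON.values() for p in pron})


def text_to_phonemes(text, lexicon=TOY_LEXICON):
    """Decompose text into a flat phoneme list; unknown words are skipped."""
    words = re.findall(r"[a-z']+", text.lower())
    phonemes = []
    for word in words:
        phonemes.extend(lexicon.get(word, []))
    return phonemes


def phoneme_distribution(text, inventory=INVENTORY, lexicon=TOY_LEXICON):
    """Return relative phoneme frequencies, ordered as in `inventory`."""
    counts = Counter(text_to_phonemes(text, lexicon))
    total = sum(counts[p] for p in inventory)
    if total == 0:
        raise ValueError("no known phonemes found in text")
    return [counts[p] / total for p in inventory]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined for a constant vector")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    passage = phoneme_distribution("The cat sat on a mat")
    sample = phoneme_distribution("The dog ran on the mat")
    print(dict(zip(INVENTORY, passage)))
    print("R^2 between the two texts:", r_squared(passage, sample))
```

Using a shared, fixed inventory keeps the two frequency vectors aligned, so phonemes absent from one text contribute a zero rather than shifting the comparison.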