Predicting the intelligibility of Mandarin Chinese with manipulated and intact tonal information for normal-hearing listeners.

IF 2.1 2区 物理与天体物理 Q2 ACOUSTICS
Chenyang Xu, Brian C J Moore, Mingfang Diao, Xiaodong Li, Chengshi Zheng
{"title":"Predicting the intelligibility of Mandarin Chinese with manipulated and intact tonal information for normal-hearing listeners.","authors":"Chenyang Xu, Brian C J Moore, Mingfang Diao, Xiaodong Li, Chengshi Zheng","doi":"10.1121/10.0034233","DOIUrl":null,"url":null,"abstract":"<p><p>Objective indices for predicting speech intelligibility offer a quick and convenient alternative to behavioral measures of speech intelligibility. However, most such indices are designed for a specific language, such as English, and they do not take adequate account of tonal information in speech when applied to languages like Mandarin Chinese (hereafter called Mandarin) for which the patterns of fundamental frequency (F0) variation play an important role in distinguishing speech sounds with similar phonetic content. To address this, two experiments with normal-hearing listeners were conducted examining: (1) The impact of manipulations of tonal information on the intelligibility of Mandarin sentences presented in speech-shaped noise (SSN) at several signal-to-noise ratios (SNRs); (2) The intelligibility of Mandarin sentences with intact tonal information presented in SSN, pink noise, and babble at several SNRs. The outcomes were not correctly predicted by the Hearing Aid Speech Perception Index (HASPI-V1). A new intelligibility metric was developed that used one acoustic feature from HASPI-V1 plus Hilbert time envelope and temporal fine structure information from multiple frequency bands. For the new metric, the Pearson correlation between obtained and predicted intelligibility was 0.923 and the root mean square error was 0.119. The new metric provides a potential tool for evaluating Mandarin intelligibility.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"156 5","pages":"3088-3101"},"PeriodicalIF":2.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Acoustical Society of America","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1121/10.0034233","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ACOUSTICS","Score":null,"Total":0}
引用次数: 0

Abstract

Objective indices for predicting speech intelligibility offer a quick and convenient alternative to behavioral measures of speech intelligibility. However, most such indices are designed for a specific language, such as English, and they do not take adequate account of tonal information in speech when applied to languages like Mandarin Chinese (hereafter called Mandarin) for which the patterns of fundamental frequency (F0) variation play an important role in distinguishing speech sounds with similar phonetic content. To address this, two experiments with normal-hearing listeners were conducted examining: (1) The impact of manipulations of tonal information on the intelligibility of Mandarin sentences presented in speech-shaped noise (SSN) at several signal-to-noise ratios (SNRs); (2) The intelligibility of Mandarin sentences with intact tonal information presented in SSN, pink noise, and babble at several SNRs. The outcomes were not correctly predicted by the Hearing Aid Speech Perception Index (HASPI-V1). A new intelligibility metric was developed that used one acoustic feature from HASPI-V1 plus Hilbert time envelope and temporal fine structure information from multiple frequency bands. For the new metric, the Pearson correlation between obtained and predicted intelligibility was 0.923 and the root mean square error was 0.119. The new metric provides a potential tool for evaluating Mandarin intelligibility.

预测普通话的可懂度,为听力正常的听者提供经过处理的和完整的音调信息。
预测言语清晰度的客观指数为言语清晰度的行为测量提供了一种快速、方便的替代方法。然而,大多数此类指数都是针对特定语言(如英语)设计的,在应用于汉语普通话(以下简称普通话)等语言时,它们没有充分考虑语音中的音调信息,而在普通话中,基频(F0)的变化模式在区分具有相似语音内容的语音方面发挥着重要作用。为了解决这个问题,我们以听力正常的听者为对象进行了两项实验:(1) 在不同信噪比(SNR)的语音噪声(SSN)中,音调信息的处理对普通话句子可懂度的影响;(2) 在不同信噪比(SNR)的语音噪声、粉红噪声和咿呀声中,音调信息完好的普通话句子的可懂度。助听器言语感知指数(HASPI-V1)无法正确预测结果。我们开发了一种新的可懂度指标,它使用了 HASPI-V1 中的一个声学特征以及多个频段的希尔伯特时间包络和时间精细结构信息。就新指标而言,获得的可懂度与预测的可懂度之间的皮尔逊相关性为 0.923,均方根误差为 0.119。新指标为评估普通话可懂度提供了一个潜在的工具。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
4.60
自引率
16.70%
发文量
1433
审稿时长
4.7 months
期刊介绍: Since 1929 The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music and noise; psychology and physiology of hearing; engineering acoustics, transduction; bioacoustics, animal bioacoustics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信