Gender and speech material effects on the long-term average speech spectrum, including at extended high frequencies.

IF 2.1 | Zone 2, Physics and Astrophysics | JCR Q2, Acoustics

Journal of the Acoustical Society of America | Pub Date: 2024-11-01 | DOI: 10.1121/10.0034231

Vahid Delaram, Margaret K Miller, Rohit M Ananthanarayana, Allison Trine, Emily Buss, G Christopher Stecker, Brian B Monson

Volume 156, issue 5, pages 3056-3066. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540443/pdf/


Abstract

Gender and language effects on the long-term average speech spectrum (LTASS) have been reported, but typically using recordings that were bandlimited and/or failed to accurately capture extended high frequencies (EHFs). Accurate characterization of the full-band LTASS is warranted given recent data on the contribution of EHFs to speech perception. The present study characterized the LTASS for high-fidelity, anechoic recordings of males and females producing Bamford-Kowal-Bench sentences, digits, and unscripted narratives. Gender had an effect on spectral levels at both ends of the spectrum: males had higher levels than females below approximately 160 Hz, owing to lower fundamental frequencies; females had ∼4 dB higher levels at EHFs, but this effect was dependent on speech material. Gender differences were also observed at ∼300 Hz, and between 800 and 1000 Hz, as previously reported. Despite differences in phonetic content, there were only small, gender-dependent differences in EHF levels across speech materials. EHF levels were highly correlated across materials, indicating relative consistency within talkers. Our findings suggest that LTASS levels at EHFs are influenced primarily by talker and gender, highlighting the need for future research to assess whether EHF cues are more audible for female speech than for male speech.
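To make the quantities in the abstract concrete, the following is a minimal sketch of how an LTASS and a band level could be computed. It is not the authors' analysis pipeline. The function names, the frame length, the 32 kHz sampling rate, the synthetic two-tone test signal, and the 8 kHz lower edge used for the EHF band are all assumptions made for illustration; the abstract does not state band edges or analysis settings.

```python
import cmath
import math


def ltass_db(samples, frame_len=64, hop=32):
    """Average the Hann-windowed power spectrum over all frames; return dB per bin."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    power_sum = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            # Naive DFT for bin k (fine for a sketch; use an FFT in practice).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            power_sum[k] += abs(acc) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    # Average power across frames first, then convert to dB.
    # The tiny floor avoids log10(0) for empty bins.
    return [10 * math.log10(p / n_frames + 1e-20) for p in power_sum]


def band_level_db(spectrum_db, fs, frame_len, f_lo, f_hi):
    """Sum power over bins whose centre frequency lies in [f_lo, f_hi]; return dB."""
    total = 0.0
    for k, level in enumerate(spectrum_db):
        f = k * fs / frame_len
        if f_lo <= f <= f_hi:
            total += 10 ** (level / 10)
    return 10 * math.log10(total + 1e-20)


# Hypothetical test signal (not speech): a 1 kHz tone plus a 10 kHz tone
# at one tenth of the amplitude, sampled at an assumed 32 kHz.
FS = 32000
FRAME_LEN = 64
samples = [math.sin(2 * math.pi * 1000 * n / FS)
           + 0.1 * math.sin(2 * math.pi * 10000 * n / FS)
           for n in range(1024)]

spectrum = ltass_db(samples, frame_len=FRAME_LEN)
low_db = band_level_db(spectrum, FS, FRAME_LEN, 0, 8000)
ehf_db = band_level_db(spectrum, FS, FRAME_LEN, 8000, 16000)  # assumed EHF band
print(f"low band minus EHF band: {low_db - ehf_db:.1f} dB")
```

With real recordings, the same band-level function could be applied per talker and per speech material, and the group means compared in dB, which is the kind of comparison behind statements such as a level difference of a few dB at EHFs. The levels here are relative (not calibrated to dB SPL), and the naive DFT is used only to keep the sketch dependency-free.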




Source journal

Journal of the Acoustical Society of America (Physics: Acoustics)

CiteScore: 4.60

Self-citation rate: 16.70%

Articles published: 1433

Review time: 4.7 months

About the journal: Since 1929, The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music and noise; psychology and physiology of hearing; engineering acoustics, transduction; bioacoustics, animal bioacoustics.