Effects of Temporal Envelope Cutoff Frequency, Number of Channels, and Carrier Type on Brainstem Neural Representation of Pitch in Vocoded Speech.

Impact factor (IF): 2.2
Saradha Ananthakrishnan, Xin Luo
Journal of Speech, Language, and Hearing Research (JSLHR), pp. 3146-3164. Journal Article. Published 2022-08-17 (Epub 2022-08-09). DOI: 10.1044/2022_JSLHR-21-00576 (https://doi.org/10.1044/2022_JSLHR-21-00576)
Citations: 1

Abstract

Purpose: The objective of this study was to determine if and how the subcortical neural representation of pitch cues in listeners with normal hearing is affected by systematic manipulation of vocoder parameters.

Method: This study assessed the effects of temporal envelope cutoff frequency (50 and 500 Hz), number of channels (1-32), and carrier type (sine-wave and noise-band) on brainstem neural representation of fundamental frequency (f_o) in frequency-following responses (FFRs) to vocoded vowels of 15 young adult listeners with normal hearing.
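To make the vocoder parameters named in the Method concrete, the following is a minimal single-channel sketch, not the authors' implementation. The sampling rate, the 100-Hz toy "vowel", the half-wave rectifier, the cascaded one-pole low-pass filter, the 1000-Hz sine carrier, and the broadband (not band-limited) noise carrier are all assumptions chosen for illustration; the abstract does not specify any of them. The sketch only shows why a 500-Hz envelope cutoff can pass f_o-rate periodicity that a 50-Hz cutoff largely removes.

```python
import cmath
import math
import random

FS = 8000    # sampling rate in Hz (assumed for this sketch)
FO = 100.0   # fundamental frequency of the toy vowel in Hz (assumed)


def harmonic_complex(fo, fs, dur, n_harmonics=10):
    """Toy 'vowel': equal-amplitude cosine harmonics of fo."""
    n = int(round(dur * fs))
    return [
        sum(math.cos(2 * math.pi * h * fo * k / fs)
            for h in range(1, n_harmonics + 1)) / n_harmonics
        for k in range(n)
    ]


def lowpass(signal, fs, cutoff_hz, passes=4):
    """Crude low-pass: a one-pole smoother applied `passes` times."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out = list(signal)
    for _ in range(passes):
        y = 0.0
        for i, x in enumerate(out):
            y += alpha * (x - y)
            out[i] = y
    return out


def envelope(signal, fs, cutoff_hz):
    """Temporal envelope: half-wave rectification, then low-pass at cutoff_hz."""
    rectified = [max(x, 0.0) for x in signal]
    return lowpass(rectified, fs, cutoff_hz)


def carrier(kind, n, fs, freq_hz=1000.0, seed=0):
    """Sine carrier at freq_hz, or white noise as a stand-in for a noise band."""
    if kind == "sine":
        return [math.sin(2 * math.pi * freq_hz * k / fs) for k in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def vocode_one_channel(signal, fs, cutoff_hz, kind):
    """Single-channel vocoder: envelope multiplied by the chosen carrier."""
    env = envelope(signal, fs, cutoff_hz)
    car = carrier(kind, len(signal), fs)
    return [e * c for e, c in zip(env, car)]


def magnitude_at(signal, fs, freq):
    """Magnitude of the single-frequency DFT of `signal` at `freq` Hz."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


vowel = harmonic_complex(FO, FS, 0.2)
half = len(vowel) // 2  # analyze the second half to skip the filter start-up

fo_in_env_50 = magnitude_at(envelope(vowel, FS, 50.0)[half:], FS, FO)
fo_in_env_500 = magnitude_at(envelope(vowel, FS, 500.0)[half:], FS, FO)
print("f_o component in 50-Hz envelope: ", fo_in_env_50)
print("f_o component in 500-Hz envelope:", fo_in_env_500)

sine_vocoded = vocode_one_channel(vowel, FS, 500.0, "sine")
noise_vocoded = vocode_one_channel(vowel, FS, 500.0, "noise")
```

A multichannel version would first split the input into band-pass analysis channels and sum the modulated carriers; that step is omitted here to keep the sketch short.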

Results: Results showed that FFR f_o strength (quantified as absolute f_o magnitude divided by noise floor [NF] magnitude) significantly improved with 500-Hz vs. 50-Hz temporal envelopes for all channel numbers and both carriers except the 1-channel noise-band vocoder. FFR f_o strength with 500-Hz temporal envelopes significantly improved when the channel number increased from 1 to 2, but it either declined (sine-wave vocoders) or saturated (noise-band vocoders) when the channel number increased from 4 to 32. FFR f_o strength with 50-Hz temporal envelopes was similarly small for both carriers with all channel numbers, except for a significant improvement with the 16-channel sine-wave vocoder. With 500-Hz temporal envelopes, FFR f_o strength was significantly greater for sine-wave vocoders than for noise-band vocoders with channel numbers 1-8; no significant differences were seen with 16 and 32 channels. With 50-Hz temporal envelopes, the carrier effect was only observed with 16 channels. In contrast, there was no significant carrier effect for the absolute f_o magnitude. Compared to sine-wave vocoders, noise-band vocoders had a higher NF and thus lower relative FFR f_o strength.
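The strength metric described in the Results (absolute f_o magnitude divided by NF magnitude) can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis pipeline: the abstract does not say how the NF was estimated, so the neighbouring-frequency average used here (`nf_offsets`) is hypothetical, as are the synthetic "responses", the sampling rate, and the noise levels. The example mirrors the reported pattern in which a similar absolute f_o magnitude combined with a higher NF gives a lower relative strength.

```python
import cmath
import math
import random


def dft_magnitude(signal, fs, freq):
    """Magnitude of the single-frequency DFT of `signal` at `freq` Hz."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def fo_strength(signal, fs, fo, nf_offsets=(-40.0, -20.0, 20.0, 40.0)):
    """Return (ratio, fo_mag, nf_mag).

    nf_mag is the mean magnitude at fo + offset for each offset; this NF
    estimate is an assumption made for the sketch.
    """
    fo_mag = dft_magnitude(signal, fs, fo)
    nf_mag = sum(dft_magnitude(signal, fs, fo + d)
                 for d in nf_offsets) / len(nf_offsets)
    return fo_mag / nf_mag, fo_mag, nf_mag


FS = 4000     # sampling rate in Hz (assumed)
FO = 100.0    # response f_o in Hz (assumed)
N = 1000      # 0.25 s, an integer number of f_o periods

rng = random.Random(0)
unit_noise = [rng.gauss(0.0, 1.0) for _ in range(N)]
tone = [0.1 * math.sin(2 * math.pi * FO * k / FS) for k in range(N)]

# Two synthetic responses with the same f_o component but different noise levels.
low_noise = [t + 0.05 * z for t, z in zip(tone, unit_noise)]
high_noise = [t + 0.5 * z for t, z in zip(tone, unit_noise)]

ratio_low, fo_low, nf_low = fo_strength(low_noise, FS, FO)
ratio_high, fo_high, nf_high = fo_strength(high_noise, FS, FO)
print("low-noise response:  f_o strength =", ratio_low)
print("high-noise response: f_o strength =", ratio_high)
```

Because the ratio depends on both terms, two conditions can differ in strength while having comparable absolute f_o magnitudes, which is why the normalization choice matters when comparing carriers.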

Conclusions: It is important to normalize the f_o magnitude relative to the NF when analyzing the FFRs to vocoded speech. The physiological findings reported here may result from the availability of f_o-related temporal periodicity and spectral sidelobes in vocoded signals and should be considered when selecting vocoder parameters and interpreting results in future physiological studies. In general, the dependence of brainstem neural phase-locking strength to f_o on vocoder parameters may confound the comparison of pitch-related behavioral results across different vocoder designs.
