JASA Express Letters: Latest Articles

Direct articulatory observation reveals phoneme recognition performance characteristics of a self-supervised speech model.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034430
Xuan Shi, Tiantian Feng, Kevin Huang, Sudarsana Reddy Kadiri, Jihwan Lee, Yijing Lu, Yubin Zhang, Louis Goldstein, Shrikanth Narayanan
Volume 4, Issue 11
Abstract: Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme recognition using direct articulatory evidence. Findings indicate significant differences in phoneme recognition, especially in front vowels, between American English and Indian English speakers. To gain a deeper understanding of these differences, we conduct real-time MRI-based articulatory analysis, revealing distinct velar region patterns during the production of specific front vowels. This underscores the need to deepen the scientific understanding of self-supervised speech model variances to advance robust and inclusive speech technology.
Citations: 0
Ambient noise source characterization using spectral, coherence, and directionality estimates at Kongsfjorden.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034307
Sanjana M C, Latha G, Thirunavukkarasu A
Volume 4, Issue 11
Abstract: Ambient noise measurements from an Arctic fjord during summer and winter are analyzed using spectral, coherence, and directionality estimates from a vertically separated pair of hydrophones. The primary noise sources attributed to wind, shipping, and ice activity are categorized and coherence is arrived at. Estimates of the noise field directionality in the vertical and its variation over time and between seasons are used to strengthen the analysis of the time-varying nature of noise sources. Source identification using such processing techniques serves as a valuable tool in passive acoustic monitoring systems for studying ice dynamics in glacierized fjords.
Citations: 0
An experimental approach for comparing the influence of cello string type on bowed attack response.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034330
Alessio Lampis, Alexander Mayer, Vasileios Chatziioannou
Volume 4, Issue 11
Abstract: This study investigates the influence of string properties on bowed string attack playability. To assess the attack playability of different string types, a variety of bow forces and bow accelerations were chosen to excite the strings and measure the transient response under different bowing control parameters. The experimentally obtained playability maps of transient duration as function of bow force and acceleration (Guettler diagram) were obtained with a robotic bowing machine, from four different types of cello G2 strings. Results indicate variations in playability across string types, suggesting that string properties impact attack duration.
Citations: 0
Speech recognition in adverse conditions by humans and machines.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0032473
Chloe Patman, Eleanor Chodroff
Volume 4, Issue 11
Abstract: In the development of automatic speech recognition systems, achieving human-like performance has been a long-held goal. Recent releases of large spoken language models have claimed to achieve such performance, although direct comparison to humans has been severely limited. The present study tested L1 British English listeners against two automatic speech recognition systems (wav2vec 2.0 and Whisper, base and large sizes) in adverse listening conditions: speech-shaped noise and pub noise, at different signal-to-noise ratios, and recordings produced with or without face masks. Humans maintained the advantage against all systems, except for Whisper large, which outperformed humans in every condition but pub noise.
Citations: 0
Starship super heavy acoustics: Far-field noise measurements during launch and the first-ever booster catch.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034453
Kent L Gee, Noah L Pulsipher, Makayle S Kellison, Logan T Mathews, Mark C Anderson, Grant W Hart
Volume 4, Issue 11
Abstract: Far-field (9.7-35.5 km) noise measurements were made during the fifth flight test of SpaceX's Starship Super Heavy, which included the first-ever booster catch. Key results involving launch and flyback sonic boom sound levels include (a) A-weighted sound exposure levels during launch are 18 dB less than predicted at 35 km; (b) the flyback sonic boom exceeds 10 psf at 10 km; and (c) comparing Starship launch noise to Space Launch System and Falcon 9 shows that Starship is substantially louder; the far-field noise produced during a Starship launch is at least ten times that of Falcon 9.
Citations: 0
Speaker adaptation using codebook integrated deep neural networks for speech enhancement.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034308
B Chidambar, D Hanumanth Rao Naidu
Volume 4, Issue 11
Abstract: Deep neural network (DNN) based speech enhancement techniques have shown superior performance compared to the traditional speech enhancement approaches in handling nonstationary noise. However, their performance is often compromised as a result of mismatch between their testing and training conditions. In this work, a codebook integrated deep neural network (CI-DNN) approach is introduced for speech enhancement, which mitigates this mismatch by employing existing speaker adapted codebooks with a DNN. The proposed CI-DNN demonstrates better speech enhancement performance compared to the corresponding speaker independent DNNs. The CI-DNN approach essentially involves a post processing operation for DNN and, hence, is applicable to any DNN architecture.
Citations: 0
The perceptual distinctiveness of the [n-l] contrast in different vowel and tonal contexts.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034196
Pauline Bolin Liu, Mingxing Li
Volume 4, Issue 11
Abstract: This study investigates the relative perceptual distinction of the [n] vs [l] contrast in different vowel contexts ([_a] vs [_i]) and tonal contexts (high-initial such as HH, HL, vs low-initial such as LL, LH). The results of two speeded AX discrimination experiments indicated that a [n-l] contrast is perceptually more distinct in the [_a] context and with a high-initial tone. The results are consistent with the typology of the [n] vs [l] contrast across Chinese dialects, which is more frequently observed in the [_a] context and with a high-initial tone, supporting a connection between phonological typology and perceptual distinctiveness.
Citations: 0
Fundamental frequency predominantly drives talker differences in auditory brainstem responses to continuous speech.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0034329
Melissa J Polonenko, Ross K Maddox
Volume 4, Issue 11
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558516/pdf/
Abstract: Deriving human neural responses to natural speech is now possible, but the responses to male- and female-uttered speech have been shown to differ. These talker differences may complicate interpretations or restrict experimental designs geared toward more realistic communication scenarios. This study found that when a male talker and a female talker had the same fundamental frequency, auditory brainstem responses (ABRs) were very similar. Those responses became smaller and later with increasing fundamental frequency, as did click ABRs with increasing stimulus rates. Modeled responses suggested that the speech and click ABR differences were reasonably predicted by peripheral and brainstem processing of stimulus acoustics.
Citations: 0
Evaluating audio quality ratings and scene analysis performance of hearing-impaired listeners for multi-track music.
IF 1.2
JASA express letters Pub Date : 2024-11-01 DOI: 10.1121/10.0032474
Aravindan Joseph Benjamin, Kai Siedenburg
Volume 4, Issue 11
Abstract: This study assessed musical scene analysis (MSA) performance and subjective quality ratings of multi-track mixes as a function of spectral manipulations using the EQ-transform (% EQT). This transform exaggerates or reduces the spectral shape changes in a given track with respect to a relatively flat, smooth reference spectrum. Data from 30 younger normal hearing (yNH) and 23 older hearing-impaired (oHI) participants showed that MSA performance was robust to changes in % EQT. However, audio quality ratings elicited from yNH participants were more sensitive to % EQT than those of oHI participants. A significant positive correlation between MSA performance and quality ratings among oHI showed that oHI participants with better MSA performances gave higher-quality ratings, whereas there was no significant correlation for yNH listeners. Overall, these data indicate the complementary virtue of measures of MSA and audio quality ratings for assessing the suitability of music mixes for hearing-impaired listeners.
Citations: 0
High-resolution acoustically informed maps of sound speed.
IF 1.2
JASA express letters Pub Date : 2024-10-01 DOI: 10.1121/10.0032475
Scott Loranger, Brendan DeCourcy, Weifeng Gordon Zhang, Ying-Tsong Lin, Andone Lavery
Volume 4, Issue 10
Abstract: As oceanographic models advance in complexity, accuracy, and resolution, in situ measurements must provide spatiotemporal information with sufficient resolution to inform and validate those models. In this study, water masses at the New England shelf break were mapped using scientific echosounders combined with water column property measurements from a single conductivity, temperature, and depth (CTD) profile. The acoustically-inferred map of sound speed was compared with a sound speed cross section based on two-dimensional interpolation of multiple CTD profiles. Long-range acoustic propagation models were then parameterized by the sound speed profiles estimated by the two methods and differences were compared.
Citations: 0