Natarajan Balaji Shankar, Amber Afshan, Alexander Johnson, Aurosweta Mahapatra, Alejandra Martin, Haolun Ni, Hae Won Park, Marlen Quintero Perez, Gary Yeung, Alison Bailey, Cynthia Breazeal, Abeer Alwan
{"title":"The JIBO Kids Corpus: A speech dataset of child-robot interactions in a classroom environment.","authors":"Natarajan Balaji Shankar, Amber Afshan, Alexander Johnson, Aurosweta Mahapatra, Alejandra Martin, Haolun Ni, Hae Won Park, Marlen Quintero Perez, Gary Yeung, Alison Bailey, Cynthia Breazeal, Abeer Alwan","doi":"10.1121/10.0034195","DOIUrl":"https://doi.org/10.1121/10.0034195","url":null,"abstract":"<p><p>This paper describes an original dataset of children's speech, collected through the use of JIBO, a social robot. The dataset encompasses recordings from 110 children, aged 4-7 years old, who participated in a letter and digit identification task and extended oral discourse tasks requiring explanation skills, totaling 21 h of session data. Spanning a 2-year collection period, this dataset contains a longitudinal component with a subset of participants returning for repeat recordings. The dataset, with session recordings and transcriptions, is publicly available, providing researchers with a valuable resource to advance investigations into child language development.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142559643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ambient noise source characterization using spectral, coherence, and directionality estimates at Kongsfjorden.","authors":"Sanjana M C, Latha G, Thirunavukkarasu A","doi":"10.1121/10.0034307","DOIUrl":"https://doi.org/10.1121/10.0034307","url":null,"abstract":"<p><p>Ambient noise measurements from an Arctic fjord during summer and winter are analyzed using spectral, coherence, and directionality estimates from a vertically separated pair of hydrophones. The primary noise sources attributed to wind, shipping, and ice activity are categorized and coherence is arrived at. Estimates of the noise field directionality in the vertical and its variation over time and between seasons are used to strengthen the analysis of the time-varying nature of noise sources. Source identification using such processing techniques serves as a valuable tool in passive acoustic monitoring systems for studying ice dynamics in glacierized fjords.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142585172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alessio Lampis, Alexander Mayer, Vasileios Chatziioannou
{"title":"An experimental approach for comparing the influence of cello string type on bowed attack response.","authors":"Alessio Lampis, Alexander Mayer, Vasileios Chatziioannou","doi":"10.1121/10.0034330","DOIUrl":"https://doi.org/10.1121/10.0034330","url":null,"abstract":"<p><p>This study investigates the influence of string properties on bowed string attack playability. To assess the attack playability of different string types, a variety of bow forces and bow accelerations were chosen to excite the strings and measure the transient response under different bowing control parameters. The experimentally obtained playability maps of transient duration as function of bow force and acceleration (Guettler diagram) were obtained with a robotic bowing machine, from four different types of cello G2 strings. Results indicate variations in playability across string types, suggesting that string properties impact attack duration.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142607745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speaker adaptation using codebook integrated deep neural networks for speech enhancement.","authors":"B Chidambar, D Hanumanth Rao Naidu","doi":"10.1121/10.0034308","DOIUrl":"https://doi.org/10.1121/10.0034308","url":null,"abstract":"<p><p>Deep neural network (DNN) based speech enhancement techniques have shown superior performance compared to the traditional speech enhancement approaches in handling nonstationary noise. However, their performance is often compromised as a result of mismatch between their testing and training conditions. In this work, a codebook integrated deep neural network (CI-DNN) approach is introduced for speech enhancement, which mitigates this mismatch by employing existing speaker adapted codebooks with a DNN. The proposed CI-DNN demonstrates better speech enhancement performance compared to the corresponding speaker independent DNNs. The CI-DNN approach essentially involves a post processing operation for DNN and, hence, is applicable to any DNN architecture.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142585173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The perceptual distinctiveness of the [n-l] contrast in different vowel and tonal contexts.","authors":"Pauline Bolin Liu, Mingxing Li","doi":"10.1121/10.0034196","DOIUrl":"https://doi.org/10.1121/10.0034196","url":null,"abstract":"<p><p>This study investigates the relative perceptual distinction of the [n] vs [l] contrast in different vowel contexts ([_a] vs [_i]) and tonal contexts (high-initial such as HH, HL, vs low-initial such as LL, LH). The results of two speeded AX discrimination experiments indicated that a [n-l] contrast is perceptually more distinct in the [_a] context and with a high-initial tone. The results are consistent with the typology of the [n] vs [l] contrast across Chinese dialects, which is more frequently observed in the [_a] context and with a high-initial tone, supporting a connection between phonological typology and perceptual distinctiveness.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142559644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fundamental frequency predominantly drives talker differences in auditory brainstem responses to continuous speech.","authors":"Melissa J Polonenko, Ross K Maddox","doi":"10.1121/10.0034329","DOIUrl":"10.1121/10.0034329","url":null,"abstract":"<p><p>Deriving human neural responses to natural speech is now possible, but the responses to male- and female-uttered speech have been shown to differ. These talker differences may complicate interpretations or restrict experimental designs geared toward more realistic communication scenarios. This study found that when a male talker and a female talker had the same fundamental frequency, auditory brainstem responses (ABRs) were very similar. Those responses became smaller and later with increasing fundamental frequency, as did click ABRs with increasing stimulus rates. Modeled responses suggested that the speech and click ABR differences were reasonably predicted by peripheral and brainstem processing of stimulus acoustics.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142592457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scott Loranger, Brendan DeCourcy, Weifeng Gordon Zhang, Ying-Tsong Lin, Andone Lavery
{"title":"High-resolution acoustically informed maps of sound speed.","authors":"Scott Loranger, Brendan DeCourcy, Weifeng Gordon Zhang, Ying-Tsong Lin, Andone Lavery","doi":"10.1121/10.0032475","DOIUrl":"https://doi.org/10.1121/10.0032475","url":null,"abstract":"<p><p>As oceanographic models advance in complexity, accuracy, and resolution, in situ measurements must provide spatiotemporal information with sufficient resolution to inform and validate those models. In this study, water masses at the New England shelf break were mapped using scientific echosounders combined with water column property measurements from a single conductivity, temperature, and depth (CTD) profile. The acoustically-inferred map of sound speed was compared with a sound speed cross section based on two-dimensional interpolation of multiple CTD profiles. Long-range acoustic propagation models were then parameterized by the sound speed profiles estimated by the two methods and differences were compared.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nadège Aoki, Benjamin Weiss, Youenn Jézéquel, Amy Apprill, T Aran Mooney
{"title":"Replayed reef sounds induce settlement of Favia fragum coral larvae in aquaria and field environments.","authors":"Nadège Aoki, Benjamin Weiss, Youenn Jézéquel, Amy Apprill, T Aran Mooney","doi":"10.1121/10.0032407","DOIUrl":"https://doi.org/10.1121/10.0032407","url":null,"abstract":"<p><p>Acoustic cues of healthy reefs are known to support critical settlement behaviors for one reef-building coral, but acoustic responses have not been demonstrated in additional species. Settlement of Favia fragum larvae in response to replayed coral reef soundscapes was observed by exposing larvae in aquaria and reef settings to playback sound treatments for 24-72 h. Settlement increased under 24 h sound treatments in both experiments. The results add to growing knowledge that acoustically mediated settlement may be widespread among stony corals with species-specific attributes, suggesting sound could be one tool employed to rehabilitate and build resilience within imperiled reef communities.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beam-space spatial spectrum reconstruction under unknown stationary near-field interference: Algorithm design and experimental verification.","authors":"Jichen Chu, Lei Cheng, Wen Xu","doi":"10.1121/10.0030334","DOIUrl":"https://doi.org/10.1121/10.0030334","url":null,"abstract":"<p><p>In acoustic array signal processing, spatial spectrum estimation and the corresponding direction-of-arrival estimation are sometimes affected by stationary near-field interferences, presenting a considerable challenge for the target detection. To address the challenge, this paper proposes a beam-space spatial spectrum reconstruction algorithm. The proposed algorithm overcomes the limitations of common spatial spectrum estimation algorithms designed for near-field interference scenarios, which require knowledge of the near-field interference array manifold. The robustness and efficacy of the proposed algorithm under strong stationary near-field interference are confirmed through the analysis of simulated and real-life experimental data.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142360734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tests of human auditory temporal resolution: Simulations of Bayesian threshold estimation for auditory gap detection.","authors":"Shuji Mori, Yuto Murata, Takashi Morimoto, Yasuhide Okamoto, Sho Kanzaki","doi":"10.1121/10.0028501","DOIUrl":"10.1121/10.0028501","url":null,"abstract":"<p><p>In an attempt to develop tests of auditory temporal resolution using gap detection, we conducted computer simulations of Zippy Estimation by Sequential Testing (ZEST), an adaptive Bayesian threshold estimation procedure, for measuring gap detection thresholds. The results showed that the measures of efficiency and precision of ZEST changed with the mean and standard deviation (SD) of the initial probability density function implemented in ZEST. Appropriate combinations of mean and SD values led to efficient ZEST performance; i.e., the threshold estimates converged to their true values after 10 to 15 trials.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}