{"title":"A data augmentation approach for an automatic speech recognition system using laser Doppler vibrometer technology.","authors":"Ji-Yan Han, Po-Hsun Huang, Ruei-Ci Shen, Cheng-Yang Liu, Lieber Po-Hung Li, An-Suey Shiao, Ying-Hui Lai","doi":"10.1121/10.0036753","DOIUrl":"https://doi.org/10.1121/10.0036753","url":null,"abstract":"<p><p>In challenging conditions such as low signal-to-noise ratios and distant speech, microphone-based automatic speech recognition (ASR) struggles with clarity. To remedy this, laser Doppler vibrometer (LDV) technology is integrated into the ASR system and a data augmentation approach is employed to generate training data containing LDV attributes. The performance of the ASR, assessed using word error rates, showed superior results with the data augmentation approach compared to the baseline ASR system trained solely on real LDV data. Thus, with the aid of data augmentation, LDV can potentially be a sound-capturing device for ASR, offering valuable insights for future applications.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144129708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word frequency effects on L2 learners' phonetic imitations.","authors":"Daiki Hashimoto, Akane Ida, Hinako Jozuka","doi":"10.1121/10.0036512","DOIUrl":"https://doi.org/10.1121/10.0036512","url":null,"abstract":"<p><p>Word frequency plays an important role in a variety of phonetic phenomena. One of the well-known observations is that low-frequency words exhibit more phonetic imitation than high-frequency words. The previous studies made this observation by exploring L1 phonetic imitation, and the current study extended the findings to L2 learners' phonetic imitations. Thirty Japanese English learners participated in this research and shadowed American English model speech stimuli. The linear combination analyses suggested that low-frequency words show a stronger imitation effect in relation to Bark-scaled F1 values. This finding is discussed in terms of implications for mental representations in the L2 lexicon.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144054790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frequency-difference sparse Bayesian learning for unambiguous direction-of-arrival estimation.","authors":"Ze Yuan, Haiqiang Niu, Zhenglin Li, Wenyu Luo","doi":"10.1121/10.0036752","DOIUrl":"https://doi.org/10.1121/10.0036752","url":null,"abstract":"<p><p>The frequency-difference (FD) method uses the FD Hadamard product, comprising auto-products to model below-band acoustic fields and unintended cross-products, for efficient direction-of-arrival (DOA) estimation under spatial aliasing. Despite improved resolution from compressive sensing, spurious peaks arise as a result of cross-products lacking counterparts in the sensing matrix. The proposed method addresses this by reconstructing the sensing matrix with the full Hadamard product and applying sparse Bayesian learning to estimate a two-dimensional hyperparameter matrix, extracting its diagonal to suppress spurious DOAs. Simulations show that it outperforms previous compressive FD methods in detecting weak targets, where advantages increase as source numbers grow.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144043923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approaches to attributing underwater noise to a wave energy converter.","authors":"Brian Polagye, Aidan Hunt, Landon Mackey, Christopher Bassett","doi":"10.1121/10.0036727","DOIUrl":"https://doi.org/10.1121/10.0036727","url":null,"abstract":"<p><p>Radiated noise from marine energy harvesting is of environmental and engineering interest. Here, drifting hydrophones measure underwater noise in the vicinity of a relatively small wave energy converter. A statistical approach is demonstrated for attributing range-dependent, commonly occurring sounds in the frequency band from 90 to 600 Hz. Time-delay-of-arrival localization is then demonstrated for attribution of individual acoustic events likely associated with the power takeoff and wave-hull interactions. Because the radiated noise from the wave energy converter falls below ambient levels at a range of approximately 150 m, it is unlikely to substantially affect marine life at greater distance.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144082637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal coherence effects on voice attribution in multi-speaker stream segregation.","authors":"Jaeeun Lee, Andrew J Oxenham","doi":"10.1121/10.0036672","DOIUrl":"10.1121/10.0036672","url":null,"abstract":"<p><p>The principle of temporal coherence predicts that two temporally coherent voices should form a unified auditory stream, whereas incoherent voices should form separate streams. This prediction was tested by asking 20 normal-hearing listeners to identify the last word spoken by the higher or lower of two talkers, preceded by temporally coherent or incoherent phrases spoken by the same two talkers, or by silence. In contrast to results from stream-segregation studies using simple repeating stimuli that manipulated temporal coherence, no significant differences in performance were observed between the conditions, raising questions regarding the generalization of temporal-coherence principles to complex speech.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077373/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144063419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Observation of forward-direct onshore and backscattered waves in water propagating in a wedge-shaped ocean.","authors":"Xiaohan Wang, Yang Dong, Shengchun Piao, Kashif Iqbal","doi":"10.1121/10.0036671","DOIUrl":"https://doi.org/10.1121/10.0036671","url":null,"abstract":"<p><p>To investigate low-frequency acoustic propagation in a wedge-shaped ocean, two ocean bottom seismometers (OBSs) were deployed on the seabed and a land-based seismometer (LS) was positioned near the coastline. A broadband acoustic source (airgun) generated signals at a standoff distance. The OBSs captured forward-direct waves and backscattered waves, while the LS detected shore-coupled forward-direct arrivals. Spectral element modeling revealed a frequency-dependent propagation mechanism: High-frequency components (>200 Hz) of the normal modes exhibited strong backscattering from the seabed slope, while low-frequency components (<200 Hz) of the first normal mode coupled into the seabed sediment and propagated onshore as geoacoustic waves.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aeroacoustic source characterization at fan test facility with spherical harmonics of the half-space.","authors":"Michail Vourakis, Franz Zotter, Eric Brandão, Elias Zea","doi":"10.1121/10.0036723","DOIUrl":"https://doi.org/10.1121/10.0036723","url":null,"abstract":"<p><p>Acoustic measurements of sources in non-ideal acoustic environments, often the case in industrial product development, issue challenges in source characterization. This study investigates the room-acoustical effects of a bespoke fan test facility on aeroacoustic source characterization via a second-order scheme of spherical harmonics of the half-space. An experimental test of a compact monopole-like sound source reveals the influence of the room's transfer function at low frequencies. Applying the scheme to a benchmark case of a low-pressure axial fan at different loading conditions showcases a satisfactory estimation of sound power and directivity.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 5","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144082636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging sound speed dynamics and generative deep learning for ray-based ocean acoustic tomography.","authors":"Priyabrata Saha, Richard X Touret, Etienne Ollivier, Jihui Jin, Matthew McKinley, Justin Romberg, Karim G Sabra","doi":"10.1121/10.0036312","DOIUrl":"https://doi.org/10.1121/10.0036312","url":null,"abstract":"<p><p>A generative deep learning framework is introduced for ray-based ocean acoustic tomography (OAT), an inverse problem for estimating sound speed profiles (SSP) based on arrival-times measurements between multiple acoustic transducers, which is typically ill-posed. This framework relies on a robust low-dimensional parametrization of the expected SSP variations using a variational autoencoder and a linear dynamical model as further regularization. This framework was tested using SSP variations simulated by a regional ocean model with submesoscale permitting horizontal resolution and various transducer configurations spanning the upper ocean over short propagation ranges and was found to outperform conventional linear least squares formulations of OAT.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 4","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143756277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating sound pressure levels from distributed acoustic sensing data using 20 Hz fin whale calls.","authors":"Léa Bouffaut, Quentin Goestchel, Robin André Rørstadbotnen, Anthony Sladen, Arthur Hartog, Holger Klinck","doi":"10.1121/10.0036351","DOIUrl":"https://doi.org/10.1121/10.0036351","url":null,"abstract":"<p><p>Distributed acoustic sensing (DAS) is a promising technology for underwater acoustics, but its instrumental response is still being investigated to enable quantitative measurements. We use fin whale 20 Hz calls to estimate the conversion between DAS-recorded strain and acoustic pressure. Our method is tested across three deployments on varied seafloor telecommunication cables and ocean basins. Results show that after accounting for well-established DAS response factors, a unique value for water compressibility provides a good estimate for the conversion. This work represents a significant step forward in characterizing DAS for marine monitoring and highlights potential limitations related to instrument noise floor.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 4","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sound power level spectra of an installed General Electric F404 engine.","authors":"Hunter J Pratt, Logan T Mathews, Tyce W Olaveson, Kent L Gee","doi":"10.1121/10.0036464","DOIUrl":"https://doi.org/10.1121/10.0036464","url":null,"abstract":"<p><p>A sound power spectrum analysis has been conducted on a T-7A-installed F404 engine, for operating conditions spanning intermediate thrust to afterburner. From free-field pressure spectra at microphone arc arrays with radii of 38 and 76 m, sound power level spectra are calculated from surface integrals and assumed axisymmetric radiation. The spectral peak-frequency region, from ∼100-500 Hz, broadens with increasing engine conditions. When the power level spectra are plotted with Strouhal number, the spectral peak decreases with engine condition. Comparing this decrease with rocket data suggests that military jet noise radiation is becoming more rocket-like, especially at afterburner conditions.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"5 4","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144044819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}