Ziqi Huang, Dominik Ochs, M Clara P Amorim, Paulo J Fonseca, Mayank Goel, Nuno Jardim Nunes, Manuel Vieira, Manuel Lopes
{"title":"Deep learning-based frameworks for the detection and classification of soniferous fish.","authors":"Ziqi Huang, Dominik Ochs, M Clara P Amorim, Paulo J Fonseca, Mayank Goel, Nuno Jardim Nunes, Manuel Vieira, Manuel Lopes","doi":"10.1121/10.0038800","DOIUrl":"10.1121/10.0038800","url":null,"abstract":"<p><p>Passive acoustic monitoring (PAM) is emerging as a valuable tool for assessing fish populations in natural habitats. This study compares two deep learning-based frameworks: (1) a multi-label segmentation-based classification system (SegClas) combining convolutional neural networks and long short-term memory networks, and (2) an object detection approach (ObjDet) using a You Only Look Once (YOLO)-based model to detect, classify, and count sounds produced by soniferous fish in the Tagus Estuary, Portugal. The target species (Lusitanian toadfish, Halobatrachus didactylus; meagre, Argyrosomus regius; and weakfish, Cynoscion regalis) exhibit overlapping vocalization patterns, posing classification challenges. Results show both methods achieve high accuracy (over 96%) and F1 scores above 87% for species-level sound identification, demonstrating their effectiveness under varied noise conditions. ObjDet generally offers slightly higher classification performance (F1 up to 92%) and can annotate each vocalization for more precise counting. However, it requires bounding-box annotations and incurs higher computational costs (inference time of ca. 1.95 s/h of recording). In contrast, SegClas relies on segment-level labels and provides faster inference (ca. 1.46 s/h). This study also compares both counting strategies, each offering distinct advantages for different ecological and operational needs. 
Our results highlight the potential of deep learning-based PAM for fish population assessment.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1060-1071"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Annoyance of interior noise in high-speed trains: Determining acceptability thresholds for different passenger activities.","authors":"Guillaume Lemaitre, Fabrice Aubin, Christophe Lambourg, Catherine Lavandier","doi":"10.1121/10.0038979","DOIUrl":"10.1121/10.0038979","url":null,"abstract":"<p><p>One of the main advantages of railway transportation is that it allows passengers to engage in a variety of activities. Thus, train operators and manufacturers seek to guarantee that the train background noise does not prevent passengers from conducting these activities. The goal of this work was to investigate the effects of background noise on passengers' perception and activities and to determine thresholds of acceptability for these activities. It combined two approaches. In the first experiment, participants rated short-term annoyance elicited by recordings of train interior noises on a continuous scale (psychoacoustic approach). The results of a multilinear regression analysis showed that loudness and emergent tonalities were the most important drivers of annoyance. In the second experiment, participants were seated in a mock-up of a high-speed train and performed tasks common to first- and second-class railway passengers (watching a TV series and reading a text), with different background noises (cognitive activities approach). Interestingly, the results showed that in the second approach (i.e., when their attention was drawn away from the sounds), only the loudness had an impact. 
Overall, combining these two approaches yielded an analysis of the factors driving annoyance and a definition of thresholds for different on-board activities.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1189-1203"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144873782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jose Rendón-Arredondo, Emma Vella, Andrea Arroyo Ramo, Michel Roger, Romain Gojon, Thierry Jardin, Stéphane Moreau
{"title":"Aeroacoustic investigations of a rotor-beam configuration in small-size drones.","authors":"Jose Rendón-Arredondo, Emma Vella, Andrea Arroyo Ramo, Michel Roger, Romain Gojon, Thierry Jardin, Stéphane Moreau","doi":"10.1121/10.0038975","DOIUrl":"10.1121/10.0038975","url":null,"abstract":"<p><p>Various aeroacoustic mechanisms involved in a rotor-beam configuration typically encountered in small-size drones in hover conditions are investigated both numerically and analytically, complemented with experimental data. High-fidelity lattice-Boltzmann method (LBM) simulations are performed on the complete experimental setup, capturing both the aerodynamic and the acoustic features of the configuration. The far-field noise is obtained by applying the Ffowcs Williams and Hawkings (FW-H) acoustic analogy. The rotor noise is also modeled as the sum of thickness noise, steady and unsteady loading noise corresponding to potential interactions between the blades and the beam. The analytical model of rotor noise relies on a strip theory, combining input velocity profiles from LBM and Sears's blade response function for each strip, and the FW-H analogy formulated in the frequency domain. The beam noise is modeled using a similar strip theory and a response model to the circulation of passing blades, based on the incompressible potential flow theory around a circular cylinder. Aerodynamic and acoustic results from the simulation and the models are in good agreement with measurements. Unsteady loading noise is found dominant for all tones for the present rotor-beam configuration corresponding to a small chord-to-beam diameter ratio. 
The three-dimensional directivities of some sound harmonics also have a unique wavy pattern in the rotor plane.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1091-1102"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144855670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A proposal for epithelial dominance in extremely high fundamental frequency vocalizations.","authors":"Ingo R Titze, Tobias Riede","doi":"10.1121/10.0038968","DOIUrl":"10.1121/10.0038968","url":null,"abstract":"<p><p>In this hypothesis article, we explore the upper limit of the fundamental frequency in vocalization. Most mammalian vocalizations are produced by airflow-induced, self-sustained vibration of vocal folds, with fundamental frequency being determined by multiple tissue layers in the folds, including muscle, ligament, and epithelial tissues. These layers contribute to vocal fold length, depth of vibration, and viscoelasticity needed for oscillation. While current vocal fold models explain a large range of frequencies, some extremely high-frequency vocalizations (e.g., whistle voice in humans) remain unexplained based on known tissue properties. We hypothesize that the thin layers near the epithelial surface become primary contributors to elasticity at high frequencies. Anatomical studies indicate weak allometric scaling in the epithelium, i.e., number of epithelial cell layers and thickness of the epithelium scale weakly with body size. This could allow species to produce frequencies outside the typical size-dependent spectral range if this layer dominates. Computational simulations using tissue property data support this hypothesis. 
We propose a model in which epithelial cells combined with collagen fibers in the lamina densa form structures capable of generating fundamental frequencies in the kilohertz range with minimal depths of vibration.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1283-1295"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12458992/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144958615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of speaking rate on f0 trajectories: Evidence from Thai lexical tones.","authors":"Francesco Burroni, James Kirby","doi":"10.1121/10.0038965","DOIUrl":"10.1121/10.0038965","url":null,"abstract":"<p><p>This paper investigates the effect of speaking rate on tone production in Thai through two production experiments involving 44 speakers. Our analyses reveal clear trends: as speech rate increases, Thai speakers predominantly adjust the rate of fundamental frequency (f0) change to maintain tonal contours, with minimal alterations to contour shapes and values. Additionally, speakers tend to elevate the f0 onset value for most tones, positioning the contour within a higher f0 space at faster rates. While effects resembling \"truncation\" and undershoot of f0 contours occur in specific tonal combinations at faster speaking rates, we suggest that they are best attributed to global retiming of laryngeal commands governing f0 production. The strategies observed in laryngeal adjustments during faster speech rates are, thus, reminiscent of those observed in supralaryngeal articulation, suggesting the existence of a unified mechanism where rate effects impact both laryngeal and supralaryngeal articulation.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1204-1226"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144873783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liu Zhang, Shengchun Piao, Junyuan Guo, Xiaohan Wang
{"title":"Variable frequency-based multi-frame coherent track-before-detect method for weak tones in passive sonar.","authors":"Liu Zhang, Shengchun Piao, Junyuan Guo, Xiaohan Wang","doi":"10.1121/10.0037220","DOIUrl":"https://doi.org/10.1121/10.0037220","url":null,"abstract":"<p><p>Passive detection for weak tones remains a challenging topic. Tonal frequency trajectory can be extracted by combining the pre-processing based on the multi-frame coherent integration with track-before-detect (TBD) method. However, complex target maneuvers can lead to intricate variations in tonal frequency, limiting the coherent processing gain. To address this issue, a variable frequency-based multi-frame coherent track-before-detect method is proposed. The evolution of tonal frequency and phase across time frames is modeled using polynomial functions. We propose a state-space dynamical system model for the time-evolving tonal signal, where the state variables are defined as the tonal amplitude and the coefficients of the polynomial used to represent the tonal frequency. The optimal model order is then analyzed based on minimizing the coherent gain loss. Furthermore, an improved particle filtering algorithm is employed to implement the established TBD model. We design a data-adaptive sequential importance sampling method. By optimizing particle sampling based on high transition probabilities, a majority of particles can be distributed in high-likelihood regions. This enables high adaptability when the tonal frequency undergoes complex variations. 
Both simulation results and processing results from the SwellEx-96 experiment demonstrate that the proposed method can improve detection performance and reduce frequency estimation error.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"923-945"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144775697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correct normalization of phonon modes for Rayleigh waves.","authors":"Michael A Stroscio, Mitra Dutta","doi":"10.1121/10.0038957","DOIUrl":"https://doi.org/10.1121/10.0038957","url":null,"abstract":"<p><p>Phonon mode normalizations derived in this paper are based on a consistent approach for normalizing Rayleigh waves. These Rayleigh waves are being used increasingly for nanoscale applications, including quantum information technology, and currently the literature is full of unnecessary approximate forms of these normalizations. The self-consistency of these derivations, for commonly used Rayleigh wave modes, is based on the second quantization procedure where the mechanical energy in the modes is equated with the energy of a phonon mode.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1072-1076"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Grant Milne, Jennifer Miksis-Olds, Dylan Wilford, Jennifer Dijkstra, Bonnie Brown
{"title":"Identifying sentinel indicators of acoustic propagation conditions using the soundscape code.","authors":"Grant Milne, Jennifer Miksis-Olds, Dylan Wilford, Jennifer Dijkstra, Bonnie Brown","doi":"10.1121/10.0038944","DOIUrl":"https://doi.org/10.1121/10.0038944","url":null,"abstract":"<p><p>Substrate composition in shallow water environments, including biological communities, strongly impacts transmission loss as propagating sounds are repeatedly reflected, scattered, and absorbed by the air-water interface and seafloor [Farcas, Thompson, and Merchant (2016). Environ. Assess. Rev. 57, 114-122]. The soundscape code (SSC), a technique that implements a collection of metrics to provide rapid, quantitative assessment of soundscape properties, was used to characterize coastal habitats in the Gulf of Maine. To determine whether SSC metrics of amplitude, impulsivity, uniformity, and periodicity serve as sentinel indicators of marine habitat composition and spatial distribution, hydrophones were deployed in three coastal habitat types (sand, macroalgae, and eelgrass) containing varied acoustic propagative properties. Deployments were replicated in four geographic locations along the Maine and New Hampshire coastlines. Hourly metrics were calculated across five frequency bands (low, 10-100 Hz; mid, 100-1000 Hz; high, 1-10 kHz; ultra-high, 10-144 kHz; and broadband, 10 Hz-144 kHz) and employed for multivariate statistical analysis to draw direct comparisons among soundscapes. Discriminant analysis revealed that habitats and geographic regions could be differentiated with high accuracy using SSC metrics as predictors. 
This finding supports the use of these metrics as sentinel indicators of acoustic propagation conditions in coastal Gulf of Maine ecosystems.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1319-1333"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144958405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Filippo Nelli, Danica Tothova, Andrew Ooi, Richard Manasseh
{"title":"Sound amplitude of discrete bubbles entrained by a breaking wave.","authors":"Filippo Nelli, Danica Tothova, Andrew Ooi, Richard Manasseh","doi":"10.1121/10.0039053","DOIUrl":"https://doi.org/10.1121/10.0039053","url":null,"abstract":"<p><p>The exchange of gases between the ocean and the atmosphere plays a major role in regulating global climate, influencing processes like carbon sequestration and the balance of atmospheric gases. This paper investigates the acoustic emissions generated from the bubbles produced by breaking waves and analyzes the relationship between bubble diameter and sound intensity. This relationship is important for estimating bubble size distributions and thus achieving a better understanding of the gas exchange between ocean and atmosphere. Experiments were conducted in a controlled laboratory setting, generating linear focused breaking waves. High-speed cameras and four hydrophones captured synchronised audio and video data from bubble formation and acoustic emission events. Results are characterized by a polynomial relationship between bubble size and acoustic emission, showing that larger bubbles produced louder sounds. This behaviour is consistent with underwater bubbles generated from other, more fundamental mechanisms, such as underwater nozzles and plunging water streams and jets.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"1443-1450"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144958665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xueli Sheng, Hang Dong, Bingyu Shi, Siyuan Cang, Kejing Sun, Yan Wang
{"title":"An interference suppression method in underwater acoustics based on Riemannian geometry.","authors":"Xueli Sheng, Hang Dong, Bingyu Shi, Siyuan Cang, Kejing Sun, Yan Wang","doi":"10.1121/10.0038753","DOIUrl":"https://doi.org/10.1121/10.0038753","url":null,"abstract":"<p><p>Array signal processing such as direction of arrival estimation and target localization is significantly impacted by strong interference. In this work, we propose a wideband interference suppression method based on the Riemannian geometry of the manifold of Hermitian positive definite matrices, specifically designed for use in passive sonar systems. We demonstrate that incorporating the Riemannian mean of the sample covariance matrix into conventional beamforming techniques results in a spatial spectrum that effectively rejects interference directions. The proposed interference suppression technique enhances the signal-to-interference ratio without requiring prior information, outperforming other competing approaches. Furthermore, the effectiveness of the proposed method is validated through numerical simulations and comparison with experimental data.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"158 2","pages":"996-1006"},"PeriodicalIF":2.3,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144794808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}