A dual-encoder U-net architecture with prior knowledge embedding for acoustic source mapping
Haobo Jia, Feiran Yang, Xiaoqing Hu, Jun Yang
Journal of the Acoustical Society of America, vol. 158, no. 3, pp. 1767-1782, published 2025-09-01
DOI: 10.1121/10.0039104
Citations: 0
Abstract
The deconvolution approach has become a standard method for high-resolution acoustic source mapping, but it suffers from a heavy computational burden. Deep learning-based methods have shown promising progress but often rely on a single type of input feature and ignore the position- and frequency-dependent variability of the point spread function (PSF), which degrades localization accuracy. This paper proposes a supervised learning framework based on a dual-encoder U-net architecture that converts beamforming maps into a high-resolution map of the true source strength distribution. Specifically, the model employs two individual encoders to extract complementary features from delay-and-sum and functional beamforming maps. Because the two maps provide distinct information on the same source strength distribution, a contrastive loss function is introduced to help the two encoders learn consistent latent features of the sources. To characterize the PSF variations, a frequency encoder and a position encoder are designed to embed prior knowledge, i.e., source frequency and grid positions, into the backbone network. The proposed model outperforms competing methods, on average, across four metrics on the simulation data and the MIRACLE dataset, and it generalizes well across different numbers of sound sources and frequencies.
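The two input maps the model consumes can be illustrated with a small, self-contained sketch. The code below simulates one point source, builds its cross-spectral matrix under a free-field monopole model, and computes both a delay-and-sum map and a Dougherty-style functional beamforming map (which raises the eigenvalues of the cross-spectral matrix to the power 1/ν and the beamformer output to ν). The 16-microphone array geometry, source position, frequency, and exponent ν are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 16 microphones in the z=0 plane, one point source.
f = 2000.0            # source frequency in Hz (assumed)
c = 343.0             # speed of sound, m/s
mics = rng.uniform(-0.5, 0.5, size=(16, 2))
mics = np.column_stack([mics, np.zeros(16)])
src = np.array([0.1, -0.2, 1.0])   # true source position (assumed, on the grid)

k = 2 * np.pi * f / c

def steering(pos):
    """Free-field steering vector for a monopole at `pos`."""
    r = np.linalg.norm(mics - pos, axis=1)
    return np.exp(-1j * k * r) / r

# Cross-spectral matrix (CSM) of a single unit-strength source, no noise.
a = steering(src)
C = np.outer(a, a.conj())

# Scan grid in the source plane z = 1.0 m.
xs = np.linspace(-0.5, 0.5, 21)
ys = np.linspace(-0.5, 0.5, 21)

def das_map(C):
    """Delay-and-sum map: w^H C w with unit-norm steering weights."""
    out = np.zeros((len(ys), len(xs)))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            w = steering(np.array([x, y, 1.0]))
            w /= np.linalg.norm(w)
            out[i, j] = np.real(w.conj() @ C @ w)
    return out

def functional_map(C, nu=16):
    """Functional beamforming: replace C by C^(1/nu) via its
    eigendecomposition, then raise the beamformer output to nu."""
    lam, V = np.linalg.eigh(C)
    lam = np.clip(lam, 0, None)
    C_root = (V * lam ** (1.0 / nu)) @ V.conj().T
    out = np.zeros((len(ys), len(xs)))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            w = steering(np.array([x, y, 1.0]))
            w /= np.linalg.norm(w)
            out[i, j] = np.real(w.conj() @ C_root @ w) ** nu
    return out

B_das = das_map(C)
B_fun = functional_map(C)

# Both maps peak at the grid point nearest the true source; the functional
# map has a sharper main lobe, which is why the two maps carry
# complementary information about the same source distribution.
peak_das = np.unravel_index(B_das.argmax(), B_das.shape)
peak_fun = np.unravel_index(B_fun.argmax(), B_fun.shape)
```

In the paper's framework these two maps would be fed to the two separate encoders; the sketch only shows why pairing them is useful (same peak locations, different lobe widths and sidelobe behavior).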
About the Journal
Since 1929, The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound, and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music, and noise; psychology and physiology of hearing; engineering acoustics and transduction; and bioacoustics, including animal bioacoustics.