NAT: Neural Acoustic Transfer for Interactive Scenes in Real Time
Xutong Jin, Bo Pang, Chenxi Xu, Xinyun Hou, Guoping Wang, Sheng Li
IEEE Transactions on Visualization and Computer Graphics, 2025-10-06. DOI: 10.1109/TVCG.2025.3617802
Abstract
Previous acoustic transfer methods rely on extensive precomputation and storage of data to enable real-time interaction and auditory feedback. However, these methods struggle with complex scenes, especially when dynamic changes in object position, material, and size significantly alter sound effects. These continuous variations produce fluctuating acoustic transfer distributions that are difficult to represent with basic data structures and to render efficiently in real time. To address this challenge, we present Neural Acoustic Transfer, a novel approach that leverages implicit neural representations to encode acoustic transfer functions and their variations, enabling real-time prediction of dynamically evolving sound fields and their interactions with the environment under varying conditions. To efficiently generate high-quality training data for the neural acoustic field without relying on the mesh quality of a model, we develop a fast Monte-Carlo-based boundary element method (BEM) approximation suitable for general scenarios with smooth Neumann boundary conditions. In addition, we devise strategies to mitigate potential singularities during the synthesis of training data, thereby improving its reliability. Together, these methods provide robust and accurate data that allow the neural network to model the complex sound radiation space effectively. We demonstrate our method's numerical accuracy and runtime efficiency (several milliseconds to synthesize 30 s of audio) through comprehensive validation and comparisons across diverse acoustic transfer scenarios. Our approach enables efficient and accurate modeling of sound behavior in dynamically changing environments, benefiting a wide range of interactive applications such as virtual reality, augmented reality, and advanced audio production.
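The central idea of an implicit neural representation of acoustic transfer can be illustrated with a minimal sketch: a network that maps listener position, scene parameters (e.g., object position, size, material), and frequency to a complex-valued transfer amplitude. The PyTorch code below is an assumption for illustration only; the paper does not publish this exact architecture, and names such as AcousticTransferField and the plain-MLP design are hypothetical.

    import torch
    import torch.nn as nn

    class AcousticTransferField(nn.Module):
        # Hypothetical implicit field: (listener position, scene parameters,
        # frequency) -> complex-valued acoustic transfer amplitude.
        def __init__(self, n_scene_params=3, hidden=256, depth=4):
            super().__init__()
            in_dim = 3 + n_scene_params + 1        # xyz + params + frequency
            layers = [nn.Linear(in_dim, hidden), nn.ReLU()]
            for _ in range(depth - 1):
                layers += [nn.Linear(hidden, hidden), nn.ReLU()]
            layers += [nn.Linear(hidden, 2)]       # real and imaginary parts
            self.net = nn.Sequential(*layers)

        def forward(self, pos, params, freq):
            x = torch.cat([pos, params, freq], dim=-1)
            out = self.net(x)
            return torch.complex(out[..., 0], out[..., 1])

    # Querying a batch of listener positions is a single forward pass,
    # which is what makes millisecond-scale runtime evaluation plausible.
    field = AcousticTransferField()
    pos = torch.rand(1024, 3)       # listener positions
    params = torch.rand(1024, 3)    # e.g. object position/size/material knobs
    freq = torch.rand(1024, 1)      # normalized frequency
    transfer = field(pos, params, freq)   # complex tensor, shape (1024,)

Once trained, such a field replaces the large precomputed transfer tables that earlier methods store, and a change in scene parameters only changes the network input rather than invalidating cached data.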
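The Monte-Carlo BEM idea can likewise be sketched. The snippet below estimates the exterior Kirchhoff-Helmholtz representation integral by uniform sampling of the boundary surface, assuming the boundary pressure and its normal derivative (the Neumann data) are already known. This is a sketch of the underlying integral, not the paper's estimator: the function names are hypothetical, the uniform-sampling scheme is an assumption, and the paper's singularity-mitigation strategies are omitted.

    import numpy as np

    def kirchhoff_helmholtz_mc(x, y, n, p_y, dpdn_y, k, total_area):
        # Monte Carlo estimate of the exterior Kirchhoff-Helmholtz integral
        #   p(x) = \int_S [ G(x,y) dp/dn(y) - p(y) dG/dn(y) ] dS(y)
        # x          : (3,)  listener point outside the surface
        # y          : (N,3) points sampled uniformly on the boundary
        # n          : (N,3) outward unit normals at the samples
        # p_y        : (N,)  boundary pressure at the samples
        # dpdn_y     : (N,)  normal derivative of pressure (Neumann data)
        # k          : wavenumber
        # total_area : surface area (Monte Carlo normalization)
        d = x[None, :] - y                        # vectors from samples to listener
        r = np.linalg.norm(d, axis=1)
        G = np.exp(1j * k * r) / (4 * np.pi * r)  # free-space Green's function
        # normal derivative of G with respect to y:
        # dG/dn = -((x - y) . n / r) * (ik - 1/r) * G
        dGdn = -(np.einsum('ij,ij->i', d, n) / r) * (1j * k - 1.0 / r) * G
        integrand = G * dpdn_y - p_y * dGdn
        return total_area * integrand.mean()

Because each sample only needs a surface point, a normal, and the local boundary data, the estimate is insensitive to mesh connectivity and element quality, which matches the abstract's motivation for avoiding mesh-quality dependence. Samples whose distance r approaches zero make the integrand blow up, which is why singularity mitigation is needed when generating training data near the boundary.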