NAT: Neural Acoustic Transfer for Interactive Scenes in Real Time.

Impact factor: 6.5

IEEE Transactions on Visualization and Computer Graphics. Publication date: 2025-10-06. DOI: 10.1109/TVCG.2025.3617802

Xutong Jin, Bo Pang, Chenxi Xu, Xinyun Hou, Guoping Wang, Sheng Li


Citations: 0

Abstract

Previous acoustic transfer methods rely on extensive precomputation and data storage to enable real-time interaction and auditory feedback. However, these methods struggle with complex scenes, especially when dynamic changes in object position, material, and size significantly alter sound effects. These continuous variations produce fluctuating acoustic transfer distributions that are challenging to represent with basic data structures and to render efficiently in real time. To address this challenge, we present Neural Acoustic Transfer (NAT), a novel approach that leverages implicit neural representations to encode acoustic transfer functions and their variations. This enables real-time prediction of dynamically evolving sound fields and their interactions with the environment under varying conditions. To generate high-quality training data for the neural acoustic field efficiently, without relying on the mesh quality of a model, we develop a fast Monte-Carlo-based boundary element method (BEM) approximation suitable for general scenarios with smooth Neumann boundary conditions. In addition, we devise strategies to mitigate potential singularities during the synthesis of training data, thereby enhancing its reliability. Together, these methods provide robust and accurate data that enable the neural network to model complex sound radiation spaces effectively. We demonstrate our method's numerical accuracy and runtime efficiency (within several milliseconds for 30 s of audio) through comprehensive validation and comparisons in diverse acoustic transfer scenarios. Our approach allows efficient and accurate modeling of sound behavior in dynamically changing environments, which can benefit a wide range of interactive applications such as virtual reality, augmented reality, and advanced audio production.
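As a rough illustration of the idea behind a Monte-Carlo treatment of boundary integrals, the sketch below estimates a single-layer Helmholtz potential over a sphere by random surface sampling and compares it with the closed-form value. It is not the authors' algorithm: the function names, the unit-density sphere, the wavenumber, and the sample count are all assumptions chosen for illustration, and the evaluation point is kept off the surface so that the kernel stays non-singular.

```python
import numpy as np

def helmholtz_green(x, y, k):
    """Free-space Helmholtz Green's function e^{ik|x-y|} / (4 pi |x-y|)."""
    d = np.linalg.norm(x - y, axis=-1)
    return np.exp(1j * k * d) / (4.0 * np.pi * d)

def mc_single_layer_sphere(x, k, radius=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the integral of G(x, y) over a sphere (unit density).

    Hypothetical helper: points are drawn uniformly on the sphere by
    normalizing Gaussian vectors, so the estimator is area * mean(G).
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n_samples, 3))
    y = radius * v / np.linalg.norm(v, axis=1, keepdims=True)
    area = 4.0 * np.pi * radius**2
    return area * np.mean(helmholtz_green(x, y, k))

def exact_single_layer_sphere(r, k, radius=1.0):
    """Closed form for an exterior point at distance r > radius from the center."""
    return radius * np.sin(k * radius) / k * np.exp(1j * k * r) / r

if __name__ == "__main__":
    x = np.array([0.0, 0.0, 3.0])  # evaluation point outside the unit sphere
    k = 2.0                        # assumed wavenumber
    est = mc_single_layer_sphere(x, k)
    ref = exact_single_layer_sphere(3.0, k)
    print("estimate:", est)
    print("exact:   ", ref)
    print("relative error:", abs(est - ref) / abs(ref))
```

Because the samples are drawn directly on the surface, no mesh is needed, which is the property the abstract highlights. As the evaluation point approaches the surface the kernel grows like 1/|x-y| and the variance of this naive estimator rises, which is one reason a practical method needs some form of singularity handling.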

