Learning an interpretable end-to-end network for real-time acoustic beamforming

IF 4.3 | Zone 2, Engineering and Technology | JCR Q1, ACOUSTICS

Journal of Sound and Vibration | Pub Date: 2024-07-10 | DOI: 10.1016/j.jsv.2024.118620


Citations: 0

Abstract



Recently, many audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ on such devices because of their high computational complexity and the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize and cannot generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end using back-propagation, guided by a model-based prior. Notably, the proposed network offers excellent interpretability and can process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the favorable performance of the DAMAS-FISTA-Net compared to alternative approaches.
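
The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea behind unrolling, not the authors' DAMAS-FISTA-Net. It assumes the standard DAMAS formulation, in which a conventional beamforming map b is related to the unknown non-negative source powers x through a point-spread-function matrix A (b = A x), and it solves that system with a fixed number of FISTA iterations. Each iteration plays the role of one "layer" with its own step size; in a learned network those step sizes would be trained, whereas here they are simply set to 1/L. The function name `unrolled_fista`, the `steps` parameter, and the toy tridiagonal PSF are all hypothetical.

```python
import numpy as np


def unrolled_fista(A, b, steps):
    """Run len(steps) FISTA iterations on min_x 0.5*||A x - b||^2 s.t. x >= 0.

    Each entry of `steps` is the gradient step size of one iteration
    ("layer"). Hypothetical helper for illustration only.
    """
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for step in steps:
        grad = A.T @ (A @ y - b)                    # gradient of the data-fit term
        x_next = np.maximum(y - step * grad, 0.0)   # projection onto x >= 0
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum step
        x, t = x_next, t_next
    return x


# Toy problem (assumed, not from the paper): a 1-D grid of 20 points with a
# tridiagonal PSF that leaks 30% of each source into its neighbours.
n = 20
A = np.eye(n) + 0.3 * (np.eye(n, k=1) + np.eye(n, k=-1))
x_true = np.zeros(n)
x_true[5], x_true[12] = 1.0, 0.5      # two point sources
b = A @ x_true                         # blurred "beamforming map"

L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
x_hat = unrolled_fista(A, b, steps=np.full(500, 1.0 / L))
```

With a constant step of 1/L this reduces to ordinary FISTA; the appeal of unrolling is that a much smaller number of layers with trained parameters may reach a comparable map, which is the property the abstract attributes to the proposed network.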


Source journal

Journal of Sound and Vibration (Engineering and Technology: Mechanical Engineering)

CiteScore: 9.10
Self-citation rate: 10.60%
Articles published: 551
Review time: 69 days

About the journal: The Journal of Sound and Vibration (JSV) is an independent journal devoted to the prompt publication of original papers, both theoretical and experimental, that provide new information on any aspect of sound or vibration. There is an emphasis on fundamental work that has potential for practical application. JSV was founded and operates on the premise that the subject of sound and vibration requires a journal that publishes papers of a high technical standard across the various subdisciplines, thus facilitating awareness of techniques and discoveries in one area that may be applicable in others.