Insights into the Incorporation of Signal Information in Binaural Signal Matching with Wearable Microphone Arrays

arXiv - EE - Audio and Speech Processing, Pub Date: 2024-09-18, DOI: arxiv-2409.11731

Ami Berger, Vladimir Tourbabin, Jacob Donley, Zamir Ben-Hur, Boaz Rafaely


Abstract

The increasing popularity of spatial audio in applications such as teleconferencing, entertainment, and virtual reality has led to the recent development of binaural reproduction methods. However, only a few of these methods are well suited for wearable and mobile arrays, which typically consist of a small number of microphones. One such method is binaural signal matching (BSM), which has been shown to produce high-quality binaural signals for wearable arrays. However, BSM may be suboptimal in cases of high direct-to-reverberant ratio (DRR), as it is based on the diffuse sound field assumption. To overcome this limitation, previous studies incorporated sound-field models other than diffuse. However, this approach was not studied comprehensively. This paper extensively investigates two BSM-based methods designed for high DRR scenarios. The methods incorporate a sound field model composed of direct and reverberant components. The methods are investigated both mathematically and using simulations, and are finally validated by a listening test. The results show that the proposed methods can significantly improve the performance of BSM, in particular in the direction of the source, while presenting only a negligible degradation in other directions. Furthermore, when source direction estimation is inaccurate, the performance of these methods degrades to equal that of BSM, presenting a desired robustness quality.
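The abstract does not give the formulation, so the following is only a rough sketch of the idea, not the authors' implementation. It assumes the usual narrowband model x = V s + n and a filter c that minimizes the mean squared error between the binaural signal h^T s and the array output c^H x. Under that assumption, the diffuse-field design corresponds to an identity source covariance, and a direct-plus-reverberant model can be imitated by adding power on the diagonal entry of the assumed source direction. The names `bsm_filter`, `per_direction_error`, `direct_power`, and the random `V` and `h` are hypothetical placeholders, not measured array transfer functions or HRTFs.

```python
import numpy as np

def bsm_filter(V, h, R_s, noise_var):
    """MMSE filter c minimizing E|h^T s - c^H x|^2 for x = V s + n.

    V: (M, Q) steering matrix, h: (Q,) one-ear HRTF per direction,
    R_s: (Q, Q) assumed source covariance, noise_var: sensor noise power.
    """
    M = V.shape[0]
    R_x = V @ R_s @ V.conj().T + noise_var * np.eye(M)
    # Closed form: c = R_x^{-1} V R_s h^*
    return np.linalg.solve(R_x, V @ R_s @ h.conj())

def per_direction_error(V, h, c):
    """Squared matching error |h_q - c^H v_q|^2 for each direction q."""
    return np.abs(h - c.conj() @ V) ** 2

# Placeholder data (assumption): random complex values, not real transfer functions.
rng = np.random.default_rng(0)
M, Q, k = 4, 32, 5            # microphones, directions, assumed source direction index
V = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
h = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)
noise_var = 0.1

# Diffuse model: equal, uncorrelated power from all directions.
R_diffuse = np.eye(Q)

# Direct + reverberant model (assumption): extra power at direction k.
direct_power = 100.0
R_direct = np.eye(Q)
R_direct[k, k] += direct_power

err_diffuse = per_direction_error(V, h, bsm_filter(V, h, R_diffuse, noise_var))
err_direct = per_direction_error(V, h, bsm_filter(V, h, R_direct, noise_var))

print("error at source direction, diffuse model:", err_diffuse[k])
print("error at source direction, direct + reverberant model:", err_direct[k])
```

In this toy setup, increasing the weight on direction k cannot increase the matching error at that direction, which mirrors the abstract's claim of improved performance toward the source. How much other directions degrade depends on the array and the data, so the sketch makes no claim about that.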
