Assessing HRTF preprocessing methods for Ambisonics rendering through perceptual models

IF 1.4 | Zone 3, Physics and Astrophysics | JCR Q4, ACOUSTICS

Acta Acustica | Pub Date: 2022-01-01 | DOI: 10.1051/aacus/2021055

Isaac Engel, Dan F. M. Goodman, L. Picinali


Citations: 9

Abstract


Assessing HRTF preprocessing methods for Ambisonics rendering through perceptual models

Binaural rendering of Ambisonics signals is a common way to reproduce spatial audio content. Processing Ambisonics signals at low spatial orders is desirable in order to reduce complexity, although it may degrade the perceived quality, in part due to the mismatch that occurs when a low-order Ambisonics signal is paired with a spatially dense head-related transfer function (HRTF). In order to alleviate this issue, the HRTF may be preprocessed so its spatial order is reduced. Several preprocessing methods have been proposed, but they have not been thoroughly compared yet. In this study, nine HRTF preprocessing methods were used to render anechoic binaural signals from Ambisonics representations of orders 1 to 44, and these were compared through perceptual hearing models in terms of localisation performance, externalisation and speech reception. This assessment was supported by numerical analyses of HRTF interpolation errors, interaural differences, perceptually-relevant spectral differences, and loudness stability. Models predicted that the binaural renderings’ accuracy increased with spatial order, as expected. A notable effect of the preprocessing method was observed: whereas all methods performed similarly at the highest spatial orders, some were considerably better at lower orders. A newly proposed method, BiMagLS, displayed the best performance overall and is recommended for the rendering of bilateral Ambisonics signals. The results, which were in line with previous literature, indirectly validate the perceptual models’ ability to predict listeners’ responses in a consistent and explicable manner.
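To give a sense of why low spatial orders reduce complexity, the following minimal sketch counts the channels of a full-sphere Ambisonics signal, which is (N + 1)^2 for spatial order N. The function names are hypothetical, and the filter count assumes the common formulation in which binaural rendering applies one HRTF-derived filter per Ambisonics channel and per ear; it is an illustration, not the authors' implementation.

```python
def ambisonics_channels(order: int) -> int:
    """Number of spherical-harmonic channels of a full-sphere
    Ambisonics signal of the given spatial order: (order + 1)^2."""
    if order < 0:
        raise ValueError("spatial order must be non-negative")
    return (order + 1) ** 2


def binaural_filter_count(order: int) -> int:
    """Filters needed for binaural rendering, assuming one filter
    per Ambisonics channel and per ear (an assumption of this sketch)."""
    return 2 * ambisonics_channels(order)


if __name__ == "__main__":
    # Orders 1 and 44 are the extremes studied in the abstract;
    # 3 and 5 are added only as intermediate illustrations.
    for order in (1, 3, 5, 44):
        print(
            f"order {order:2d}: "
            f"{ambisonics_channels(order):4d} channels, "
            f"{binaural_filter_count(order):4d} filters"
        )
```

Under these assumptions, order 1 needs 4 channels while order 44 needs 2025, which is why reducing the spatial order of the HRTF to match a low-order signal is attractive.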


Source journal

Acta Acustica (ACOUSTICS)

CiteScore: 2.80
Self-citation rate: 21.40%
Articles published: not listed
Review time: 12 weeks

About the journal: Acta Acustica is the journal of the European Acoustics Association (EAA). The EAA published Acta Acustica from 1993 to 1995 and Acta Acustica united with Acustica from 1996 to 2019. Since 2020, the journal has been published in full Open Access (see Article Processing Charges). Acta Acustica reports on original scientific research in acoustics and on engineering applications. It considers review papers, scientific papers, technical and applied papers, short communications, and letters to the editor; special issues and review articles are also published from time to time. For book reviews or doctoral thesis abstracts, please contact the Editor in Chief.