高效逼近子带中的头部相关传递函数，实现精确声音定位

IEEE Transactions on Audio Speech and Language Processing Pub Date : 2015-07-01

Damián Marelli, Robert Baumgartner, Piotr Majdak

{"title":"高效逼近子带中的头部相关传递函数，实现精确声音定位","authors":"Damián Marelli, Robert Baumgartner, Piotr Majdak","doi":"","DOIUrl":null,"url":null,"abstract":"Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms separately optimize the complexity of the transfer matrix associated to each HRTF for fixed FBs. The other two algorithms jointly optimize the FBs and transfer matrices for complete HRTF sets by two variants. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were modeled and conducted to find a reasonable approximation tolerance so that no significant localization performance degradation was introduced by the subband representation.","PeriodicalId":55014,"journal":{"name":"IEEE Transactions on Audio Speech and Language Processing","volume":"23 7","pages":"1130-1143"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4678625/pdf/","citationCount":"0","resultStr":"{\"title\":\"Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization.\",\"authors\":\"Damián Marelli, Robert Baumgartner, Piotr Majdak\",\"doi\":\"\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms separately optimize the complexity of the transfer matrix associated to each HRTF for fixed FBs. The other two algorithms jointly optimize the FBs and transfer matrices for complete HRTF sets by two variants. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were modeled and conducted to find a reasonable approximation tolerance so that no significant localization performance degradation was introduced by the subband representation.\",\"PeriodicalId\":55014,\"journal\":{\"name\":\"IEEE Transactions on Audio Speech and Language Processing\",\"volume\":\"23 7\",\"pages\":\"1130-1143\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4678625/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Audio Speech and Language Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Audio Speech and Language Processing","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

头部相关传递函数（HRTFs）描述了人体形态对传入声音的声学过滤，对于听者在虚拟听觉显示中定位声源至关重要。由于渲染复杂的虚拟场景对计算要求很高，因此我们提出了四种算法，用于在子带中有效地表示 HRTF，即作为分析滤波器库（FB），然后是传递矩阵和合成滤波器库。所有四种算法都使用稀疏近似程序，以最大限度地降低计算复杂度，同时保持与感知相关的 HRTF 特性。前两种算法分别优化了固定 FB 的每个 HRTF 相关传递矩阵的复杂度。另外两种算法通过两种变体联合优化了完整 HRTF 集的 FB 和传递矩阵。第一个变体旨在最小化传递矩阵的复杂度，而第二个变体则针对 FB 进行优化。数值实验研究了延迟与复杂性之间的权衡，结果表明，与其他可用方法相比，所提出的方法大大节省了计算量。为了找到一个合理的近似容差，使子带表示不会带来明显的定位性能下降，我们建立了心理声学定位模型并进行了实验。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

本刊更多论文

Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization.

Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms separately optimize the complexity of the transfer matrix associated to each HRTF for fixed FBs. The other two algorithms jointly optimize the FBs and transfer matrices for complete HRTF sets by two variants. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were modeled and conducted to find a reasonable approximation tolerance so that no significant localization performance degradation was introduced by the subband representation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Audio Speech and Language Processing 工程技术-工程：电子与电气

自引率

0.00%

发文量

审稿时长

24.0 months

期刊介绍： The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.