Cross-Dataset Head-Related Transfer Function Harmonization Based on Perceptually Relevant Loss Function

Impact factor 2.7 | JCR Q2 | Engineering, Electrical & Electronic
Jiale Zhao; Dingding Yao; Junfeng Li
IEEE Open Journal of Signal Processing, vol. 6, pp. 865-875, published 2025-07-17. DOI: 10.1109/OJSP.2025.3590248
Full text: https://ieeexplore.ieee.org/document/11082560/
Citations: 0

Abstract

Head-Related Transfer Functions (HRTFs) play a vital role in binaural spatial audio rendering. With the release of numerous HRTF datasets in recent years, abundant data has become available to support deep-learning-based HRTF research. However, measurement discrepancies across datasets introduce significant variations in the data, and directly merging these datasets may lead to systematic biases. The recent Listener Acoustic Personalization Challenge 2024 (European Signal Processing Conference) addressed this issue, with the task of harmonizing different datasets to achieve lower classification accuracy while meeting thresholds on various localization metrics. To mitigate cross-dataset differences, this paper proposes a neural-network-based HRTF harmonization approach aimed at eliminating dataset-specific properties embedded in the original measurements. The proposed method uses a perceptually relevant loss function that jointly constrains multiple objectives, including interaural level differences, auditory-filter excitation patterns, and classification accuracy. Experimental results on eight datasets demonstrate that the proposed approach can effectively minimize distributional disparities between datasets while mostly preserving localization performance. The classification accuracy for harmonized HRTFs between different datasets is reduced to as low as 31%, indicating a significant reduction in cross-dataset discrepancies. The proposed method ranked first in the challenge, which validates its effectiveness.
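The abstract names three jointly constrained objectives but does not give the loss itself. The following is only an illustrative sketch of how such a multi-term loss could be assembled, not the authors' formulation. Everything in it is an assumption: the function names are hypothetical, equal-width band energies stand in for true auditory-filter excitation patterns, the classifier term is a KL divergence from the uniform distribution (one common way to push a dataset classifier toward chance), and the weights are arbitrary.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def ild_db(left_mag, right_mag):
    """Broadband interaural level difference (dB) from per-bin magnitudes."""
    e_left = sum(m * m for m in left_mag)
    e_right = sum(m * m for m in right_mag)
    return 10.0 * math.log10((e_left + EPS) / (e_right + EPS))


def band_levels_db(mag, n_bands=4):
    """Energy per equal-width band in dB.

    Assumption: a crude stand-in for an auditory-filter excitation pattern,
    which would normally use ERB-spaced filters rather than equal-width bands.
    """
    size = max(1, len(mag) // n_bands)
    return [
        10.0 * math.log10(sum(m * m for m in mag[b * size:(b + 1) * size]) + EPS)
        for b in range(n_bands)
    ]


def confusion_penalty(dataset_probs):
    """KL(uniform || p): zero when the dataset classifier is at chance."""
    k = len(dataset_probs)
    u = 1.0 / k
    return sum(u * math.log(u / max(p, EPS)) for p in dataset_probs)


def harmonization_loss(original, harmonized, dataset_probs,
                       weights=(1.0, 1.0, 1.0)):
    """Weighted sum of an ILD term, a band-level term, and a classifier term.

    original / harmonized: (left_magnitudes, right_magnitudes) pairs.
    dataset_probs: hypothetical classifier output for the harmonized HRTF.
    """
    (orig_l, orig_r), (harm_l, harm_r) = original, harmonized
    # Keep the interaural level difference close to the measured one.
    ild_term = abs(ild_db(harm_l, harm_r) - ild_db(orig_l, orig_r))
    # Keep per-ear band levels close to the measured ones.
    diffs = [
        abs(h - o)
        for orig_ear, harm_ear in ((orig_l, harm_l), (orig_r, harm_r))
        for o, h in zip(band_levels_db(orig_ear), band_levels_db(harm_ear))
    ]
    band_term = sum(diffs) / len(diffs)
    # Penalize outputs that still reveal which dataset they came from.
    cls_term = confusion_penalty(dataset_probs)
    w_ild, w_band, w_cls = weights
    return w_ild * ild_term + w_band * band_term + w_cls * cls_term
```

In this sketch the first two terms pull the harmonized HRTF toward the original measurement (preserving localization cues), while the third pulls it away from anything a dataset classifier can exploit; the weights set the trade-off between the two goals.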
Source journal metrics
CiteScore: 5.30
Self-citation rate: 0.00%
Articles published: 0
Review time: 22 weeks