基于感知相关损失函数的跨数据集头部相关传递函数协调

IF 2.7 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE open journal of signal processing Pub Date : 2025-07-17 DOI:10.1109/OJSP.2025.3590248

Jiale Zhao;Dingding Yao;Junfeng Li

{"title":"基于感知相关损失函数的跨数据集头部相关传递函数协调","authors":"Jiale Zhao;Dingding Yao;Junfeng Li","doi":"10.1109/OJSP.2025.3590248","DOIUrl":null,"url":null,"abstract":"Head-Related Transfer Functions (HRTFs) play a vital role in binaural spatial audio rendering. With the release of numerous HRTF datasets in recent years, abundant data has become available to support HRTF-related research based on deep learning. However, measurement discrepancies across different datasets introduce significant variations in the data and directly merging these datasets may lead to systematic biases. The recent Listener Acoustic Personalization Challenge 2024 (European Signal Processing Conference) dealt with this issue, with the task of harmonizing different datasets to achieve lower classification accuracy while meeting thresholds over various localization metrics. To mitigate cross-dataset differences, this paper proposes a neural network-based HRTF harmonization approach aimed at eliminating dataset-specific properties embedded in the original measurements. The proposed method utilizes a perceptually relevant loss function, which jointly constrains multiple objectives, including interaural level differences, auditory-filter excitation patterns, and classification accuracy. Experimental results based on eight datasets demonstrate that the proposed approach can effectively minimize distributional disparities between datasets while mostly preserving localization performance. The classification accuracy for harmonized HRTFs between different datasets is reduced to as low as 31%, indicating a significant reduction in cross-dataset discrepancies. The proposed method ranked first in this challenge, which validates its effectiveness.","PeriodicalId":73300,"journal":{"name":"IEEE open journal of signal processing","volume":"6 ","pages":"865-875"},"PeriodicalIF":2.7000,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11082560","citationCount":"0","resultStr":"{\"title\":\"Cross-Dataset Head-Related Transfer Function Harmonization Based on Perceptually Relevant Loss Function\",\"authors\":\"Jiale Zhao;Dingding Yao;Junfeng Li\",\"doi\":\"10.1109/OJSP.2025.3590248\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Head-Related Transfer Functions (HRTFs) play a vital role in binaural spatial audio rendering. With the release of numerous HRTF datasets in recent years, abundant data has become available to support HRTF-related research based on deep learning. However, measurement discrepancies across different datasets introduce significant variations in the data and directly merging these datasets may lead to systematic biases. The recent Listener Acoustic Personalization Challenge 2024 (European Signal Processing Conference) dealt with this issue, with the task of harmonizing different datasets to achieve lower classification accuracy while meeting thresholds over various localization metrics. To mitigate cross-dataset differences, this paper proposes a neural network-based HRTF harmonization approach aimed at eliminating dataset-specific properties embedded in the original measurements. The proposed method utilizes a perceptually relevant loss function, which jointly constrains multiple objectives, including interaural level differences, auditory-filter excitation patterns, and classification accuracy. Experimental results based on eight datasets demonstrate that the proposed approach can effectively minimize distributional disparities between datasets while mostly preserving localization performance. The classification accuracy for harmonized HRTFs between different datasets is reduced to as low as 31%, indicating a significant reduction in cross-dataset discrepancies. The proposed method ranked first in this challenge, which validates its effectiveness.\",\"PeriodicalId\":73300,\"journal\":{\"name\":\"IEEE open journal of signal processing\",\"volume\":\"6 \",\"pages\":\"865-875\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2025-07-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11082560\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE open journal of signal processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11082560/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE open journal of signal processing","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11082560/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

摘要

头部相关传递函数（hrtf）在双耳空间音频渲染中起着至关重要的作用。近年来，随着大量HRTF数据集的发布，为基于深度学习的HRTF相关研究提供了丰富的数据支持。然而，不同数据集之间的测量差异会导致数据的显著变化，直接合并这些数据集可能会导致系统偏差。最近的听众声学个性化挑战2024（欧洲信号处理会议）处理了这个问题，其任务是协调不同的数据集，以达到较低的分类精度，同时满足各种定位指标的阈值。为了减轻跨数据集的差异，本文提出了一种基于神经网络的HRTF协调方法，旨在消除嵌入在原始测量中的数据集特定属性。该方法利用感知相关损失函数，共同约束多个目标，包括耳间电平差异、听觉滤波激励模式和分类精度。基于8个数据集的实验结果表明，该方法可以有效地减少数据集之间的分布差异，同时基本保持定位性能。不同数据集之间协调hrtf的分类准确率降至31%，表明跨数据集差异显著降低。该方法在本次挑战中排名第一，验证了其有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Cross-Dataset Head-Related Transfer Function Harmonization Based on Perceptually Relevant Loss Function

Head-Related Transfer Functions (HRTFs) play a vital role in binaural spatial audio rendering. With the release of numerous HRTF datasets in recent years, abundant data has become available to support HRTF-related research based on deep learning. However, measurement discrepancies across different datasets introduce significant variations in the data and directly merging these datasets may lead to systematic biases. The recent Listener Acoustic Personalization Challenge 2024 (European Signal Processing Conference) dealt with this issue, with the task of harmonizing different datasets to achieve lower classification accuracy while meeting thresholds over various localization metrics. To mitigate cross-dataset differences, this paper proposes a neural network-based HRTF harmonization approach aimed at eliminating dataset-specific properties embedded in the original measurements. The proposed method utilizes a perceptually relevant loss function, which jointly constrains multiple objectives, including interaural level differences, auditory-filter excitation patterns, and classification accuracy. Experimental results based on eight datasets demonstrate that the proposed approach can effectively minimize distributional disparities between datasets while mostly preserving localization performance. The classification accuracy for harmonized HRTFs between different datasets is reduced to as low as 31%, indicating a significant reduction in cross-dataset discrepancies. The proposed method ranked first in this challenge, which validates its effectiveness.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE open journal of signal processing

CiteScore

5.30

自引率

0.00%

发文量

审稿时长

22 weeks