Individual HRTF Prediction Based on Anthropometric Data and Multi-Stage Model

2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW). Pub Date: 2023-07-01. DOI: 10.1109/ICMEW59549.2023.00060

Yinliang Qiu, Zhiyu Li, Jing Wang


Abstract

Obtaining an individual head-related transfer function (HRTF) is an important step in rendering immersive binaural audio. An individual HRTF can provide a more realistic experience than a generic HRTF. To improve prediction accuracy, we propose a multi-stage model that predicts individual HRTFs from anthropometric data. The model combines global and local features across its stages. In the first stage, a light gradient boosting machine (LightGBM) serves as the decision tree model and predicts the HRTF from anthropometric data and different angles. In the second stage, a Transformer encoder learns the global information across different frequency points. Experimental results show that the multi-stage model performs better than a single model: the spectral distortion of its predictions is smaller, which illustrates the effectiveness of the approach.
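The abstract reports spectral distortion as its evaluation metric but does not state the formula. As a rough illustration, the sketch below uses a definition common in HRTF work: the root-mean-square difference in dB between the measured and predicted magnitude responses across frequency bins. The function name `spectral_distortion`, the `eps` guard, and the sample values are assumptions made for this example and are not taken from the paper.

```python
import math


def spectral_distortion(h_true, h_pred, eps=1e-12):
    """RMS log-magnitude difference in dB between two HRTFs.

    h_true, h_pred: sequences of real or complex frequency-bin values.
    eps: small constant (an assumption of this sketch) that avoids log(0).
    """
    if len(h_true) != len(h_pred) or len(h_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_db = []
    for a, b in zip(h_true, h_pred):
        # abs() returns the magnitude for both real and complex values.
        ratio = (abs(a) + eps) / (abs(b) + eps)
        squared_db.append((20.0 * math.log10(ratio)) ** 2)
    return math.sqrt(sum(squared_db) / len(squared_db))


# Hypothetical magnitudes for four frequency bins.
measured = [1.0, 0.5, 0.25, 2.0]
same = spectral_distortion(measured, measured)
scaled = spectral_distortion(measured, [10 * m for m in measured])
print(same)    # identical responses: no distortion
print(scaled)  # uniform 10x magnitude error: close to 20 dB
```

A lower value means the predicted magnitude spectrum is closer to the measured one, which is the sense in which the abstract calls a smaller spectral distortion better.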


