ITD Modeling Based on Anthropometrics and KEMAR Coefficients Using Deep Neural Networks

2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI:10.1109/GlobalSIP45357.2019.8969348

Saif S. Alotaibi, M. Wickert


Abstract

The interaural time difference (ITD) and interaural level difference (ILD), as functions of source arrival direction, serve as essential binaural cues for spatial hearing. Individualized ITD and ILD can be used to render better 3D audio than non-individualized cues. Because ITD correlates with some anthropometric features, machine learning methods such as principal component analysis (PCA) and deep neural networks (DNNs) have become important tools for deriving individualized ITDs. The available measured ITDs do not match the exact sound source directions. An ITD correction method is presented to overcome the irregularities that occur due to subject head movements during the database creation measurements. KEMAR's ITD coefficients are used to correct the misplacement of a subject's ITD. DNNs are used to obtain a new subject's ITD for 1250 different azimuth and elevation angles. Mean absolute error (MAE) is used to compare the proposed ITD model with the available analytical models.
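The MAE comparison against an analytical model can be illustrated with a minimal sketch. The abstract does not say which analytical models were compared, so the Woodworth spherical-head formula, ITD = (a/c)(θ + sin θ), is used here purely as a familiar stand-in. The function names, the default head radius (0.0875 m), and the speed of sound (343 m/s) are assumptions for illustration and are not taken from the paper.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Woodworth spherical-head ITD in seconds (assumed example model).

    Valid for a far-field source in the horizontal plane with azimuth in
    [-90, 90] degrees; the sign of the result follows the sign of the azimuth.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must lie in [-90, 90] degrees")
    theta = math.radians(abs(azimuth_deg))
    itd = (head_radius_m / speed_of_sound) * (theta + math.sin(theta))
    return math.copysign(itd, azimuth_deg)


def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over paired samples."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


if __name__ == "__main__":
    azimuths = [-90, -45, 0, 45, 90]
    analytical = [woodworth_itd(az) for az in azimuths]
    # Placeholder "model output": the analytical curve scaled by 5 percent.
    # This is synthetic data for demonstration, not a result from the paper.
    placeholder_model = [1.05 * itd for itd in analytical]
    mae = mean_absolute_error(placeholder_model, analytical)
    print(f"MAE = {mae * 1e6:.2f} microseconds")
```

In practice the reference would be the measured (and corrected) ITDs for a held-out subject, and each candidate model, analytical or DNN-based, would be scored against it over the same grid of directions.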
