A Time-domain Unsupervised Learning Based Sound Source Localization Method

2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP)
Pub Date: 2020-09-01
DOI: 10.1109/ICICSP50920.2020.9232117

Yankun Huang, Xihong Wu, T. Qu

Citations: 7

Abstract

In recent years, deep neural networks have been applied in many fields. This paper proposes a time-domain sound source localization method based on unsupervised learning. The method adopts auto-encoder neural networks, which removes operations such as time-delay compensation and the need to prepare training data with precisely aligned labels. To improve performance, the paper also proposes a training strategy based on multi-task learning and the acoustic transfer function, called joint training of alternating and splitting. Experiments show that the proposed method can learn transmission characteristics, including changes in time delay and intensity. The proposed method also outperforms SRP-PHAT, MUSIC, and two other neural-network-based methods.
