A Time-domain Unsupervised Learning Based Sound Source Localization Method
Authors: Yankun Huang, Xihong Wu, T. Qu
Venue: 2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP)
Publication date: 2020-09-01
DOI: 10.1109/ICICSP50920.2020.9232117 (https://doi.org/10.1109/ICICSP50920.2020.9232117)
Citations: 7
Abstract
In recent years, deep neural networks have been applied in many fields. This paper proposes a time-domain, unsupervised-learning-based sound source localization method. It adopts auto-encoder neural networks, so that operations such as time-delay compensation can be removed and training data with precise alignment labels need not be prepared. To improve performance, a training strategy based on multi-task learning and the acoustic transfer function, called joint training of alternating and splitting, is also proposed. Experiments show that the proposed method can learn the transmission characteristics, including changes in time delay and intensity. Moreover, the proposed method outperforms SRP-PHAT, MUSIC, and two other neural-network-based methods.
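For context on the explicit time-delay handling that the abstract says the proposed method avoids, the following is a minimal sketch of GCC-PHAT delay estimation between two microphone signals, the pairwise building block of the SRP-PHAT baseline. It is not the paper's method. The function name `gcc_phat_delay` and its parameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs):
    """Estimate the delay (in seconds) of `sig` relative to `ref` via GCC-PHAT.

    Illustrative helper only; name and signature are assumptions,
    not taken from the paper.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = sig.shape[0] + ref.shape[0]
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep the phase, drop the magnitude.
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Reorder so that index 0 corresponds to lag -max_shift.
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    # The peak position gives the delay in samples.
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs

# Usage: a noise signal and a copy delayed by 5 samples.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref[:-5]))
delay_seconds = gcc_phat_delay(sig, ref, fs=16000)
delay_samples = round(delay_seconds * 16000)
```

In a conventional pipeline, delays like this are estimated or compensated for each microphone pair and candidate direction. The abstract's claim is that the auto-encoder learns such transmission characteristics directly from the time-domain signals.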