Tracking and separation of multiple moving speech sources via cardinality balanced multi-target multi Bernoulli (CBMeMBer) filter and time frequency masking
Authors: Nicholas Chong, S. Nordholm, B. Vo, I. Murray
Published in: 2016 International Conference on Control, Automation and Information Sciences (ICCAIS), 2016-10-01
DOI: 10.1109/ICCAIS.2016.7822441 (https://doi.org/10.1109/ICCAIS.2016.7822441)
Citations: 6
Abstract
In a “conference room scenario”, the number of speech sources is not known a priori, and the number of active sources remains unknown because sources appear and disappear throughout the measurement period. Furthermore, the speech sources move, so their mixing parameters change with time. As a result, traditional source separation techniques are limited by their ability to attribute the correct mixing parameters to the respective sources. The “conference room scenario” problem is very challenging because it involves the localization, tracking and separation of a time-varying number of moving speech sources. This paper proposes an online solution that addresses the “conference room scenario” problem systematically by performing source localization, tracking and separation in stages.