S. Thakallapalli, Sudarsana Reddy Kadiri, S. Gangashetty
Spectral Features derived from Single Frequency Filter for Multispeaker Localization
2020 National Conference on Communications (NCC), published 2020-02-01
DOI: https://doi.org/10.1109/NCC48643.2020.9056007
Citations: 2
Abstract
In this paper, we present a multispeaker localization method that uses time delay estimates obtained from spectral features derived from the single frequency filter (SFF) representation. The mixture signals are transformed into the SFF domain, from which the temporal envelopes are extracted at each frequency. Spectral features, such as the mean and variance of the temporal envelopes across frequencies, are then correlated across channels to extract the time delay estimates. Since these features emphasize the high-SNR regions of the mixtures, correlating the corresponding features across channels leads to robust delay estimates in real acoustic environments. We study the efficacy of the developed approach by comparing its performance with existing correlation-based time delay estimation techniques. Both a standard data set recorded in real-room acoustic environments and a simulated data set are used for evaluation. The localization performance of the proposed algorithm closely matches that of a state-of-the-art correlation approach and exceeds that of the other approaches.
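The pipeline described in the abstract (SFF envelopes, spectral features across frequencies, cross-channel correlation) can be illustrated with a minimal sketch. The abstract does not give the filter details, so the form used here is an assumption based on the commonly described SFF formulation: differencing of the signal, a frequency shift that moves each analysis frequency to half the sampling rate, and a single-pole filter with its pole at z = -r, where r = 0.995. The function names `sff_envelopes` and `delay_from_feature` are hypothetical, and the sketch estimates a single delay between two channels; it does not cover how multiple speakers are separated from the correlation peaks.

```python
import numpy as np
from scipy.signal import lfilter


def sff_envelopes(x, fs, freqs_hz, r=0.995):
    """Temporal envelope of x at each frequency in freqs_hz.

    Assumed SFF form: shift each frequency to fs/2, then apply the
    single-pole filter H(z) = 1 / (1 + r z^-1) and take the magnitude.
    """
    x = np.diff(x, prepend=x[0])  # differencing (assumed pre-processing)
    n = np.arange(len(x))
    env = np.empty((len(freqs_hz), len(x)))
    for k, f in enumerate(freqs_hz):
        w_shift = np.pi - 2.0 * np.pi * f / fs  # moves f to fs/2
        shifted = x * np.exp(1j * w_shift * n)
        y = lfilter([1.0], [1.0, r], shifted)  # y[n] = -r*y[n-1] + shifted[n]
        env[k] = np.abs(y)
    return env


def delay_from_feature(feat_a, feat_b, max_lag):
    """Lag in samples that maximizes the cross-correlation of two features.

    A positive result means feat_b lags feat_a.
    """
    a = feat_a - feat_a.mean()
    b = feat_b - feat_b.mean()
    xc = np.correlate(b, a, mode="full")
    lags = np.arange(-(len(a) - 1), len(b))
    keep = np.abs(lags) <= max_lag
    return lags[keep][np.argmax(xc[keep])]


if __name__ == "__main__":
    # Synthetic two-channel example: channel 2 is channel 1 delayed by 5 samples.
    rng = np.random.default_rng(0)
    fs = 8000
    src = np.zeros(6000)
    src[500:4000] = rng.standard_normal(3500)
    mic1 = src
    mic2 = np.concatenate([np.zeros(5), src[:-5]])
    freqs = np.arange(200, 3800, 400)
    e1 = sff_envelopes(mic1, fs, freqs)
    e2 = sff_envelopes(mic2, fs, freqs)
    # Mean of the envelopes across frequencies, one value per time sample.
    print(delay_from_feature(e1.mean(axis=0), e2.mean(axis=0), max_lag=20))
```

The variance across frequencies can be substituted for the mean by passing `e1.var(axis=0)` and `e2.var(axis=0)` instead. Mapping the estimated delay to a direction of arrival additionally requires the microphone geometry, which the abstract does not specify.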