Speaker Recognition on Mono-Channel Telephony Recordings
Y. Solewicz, Noa Cohen, Johan Rohdin, S. Madikeri, Jan "Honza" Černocký
The Speaker and Language Recognition Workshop, published 2022-06-28
DOI: 10.21437/odyssey.2022-27 (https://doi.org/10.21437/odyssey.2022-27)
Citations: 0
Abstract
Conversations stored as mono recordings are a common problem in many real-world speaker recognition applications. In this paper, we focus on investigative scenarios in which a number of mono telephone conversations are available for a speaker of interest. For example, a human operator may have verified that the speaker is present in these conversations. We propose several approaches for automatically creating enrollment models for the speaker of interest from such data. We then use the enrollment models to search for appearances of the speaker of interest in other calls. We analyze the performance of the different methods on two datasets that match our scenario: one from a simulated case and one from a real case. We show that even simple methods that require no tunable settings can perform well in these challenging and unpredictable scenarios. Nevertheless, larger databases should be used to confirm these findings.