Adversarial Multi-Task Learning for Speaker Normalization in Replay Detection
Gajan Suthokumar, V. Sethu, Kaavya Sriskandaraja, E. Ambikairajah
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6609-6613, published 2020-05-01
DOI: 10.1109/ICASSP40776.2020.9054322 (https://doi.org/10.1109/ICASSP40776.2020.9054322)
Citation count: 6
Abstract
Spoofing detection algorithms in voice biometrics are adversely affected by differences in the speech characteristics of the various target users. In this paper, we propose a novel speaker normalisation technique that employs adversarial multi-task learning to compensate for this speaker variability. The proposed system is designed to learn a feature space that discriminates between genuine and replayed speech while simultaneously reducing the discrimination between different speakers. We first characterise the impact of speaker variability and quantify the effect of the proposed speaker normalisation technique directly on the feature distributions. We then validate the technique in spoofing detection experiments carried out on two different corpora, ASVspoof 2017 v2.0 and BTAS 2016 replay, and demonstrate its effectiveness. We obtain equal error rates (EERs) of 7.11% and 0.83% on the two corpora, respectively, lower than those of all relevant baselines.
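The abstract does not say how the adversarial objective is optimised. One common realisation of adversarial multi-task learning is gradient reversal, and the sketch below assumes it purely for illustration; it is not the authors' architecture. The toy model uses a linear encoder, one logistic replay head and one logistic speaker head over only two speakers. All names (`adversarial_step`, `lam`, `w`, `a`, `b`) are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by both toy classification heads."""
    return 1.0 / (1.0 + math.exp(-z))


def adversarial_step(w, a, b, x, y, s, lr=0.1, lam=1.0):
    """One SGD step of a toy adversarial multi-task model (illustrative only).

    w   : encoder weights; the shared feature is f = sum(w_i * x_i)
    a   : replay-head weight, P(genuine) = sigmoid(a * f)
    b   : speaker-head weight, P(speaker 1) = sigmoid(b * f)
    y   : replay label (1 = genuine, 0 = replayed)
    s   : speaker label (0 or 1; two speakers only, an assumption)
    lam : weight of the reversed speaker gradient (assumed hyperparameter)
    """
    f = sum(wi * xi for wi, xi in zip(w, x))
    p_y = sigmoid(a * f)
    p_s = sigmoid(b * f)

    # Cross-entropy gradients with respect to the shared feature f.
    g_y = (p_y - y) * a
    g_s = (p_s - s) * b

    # Each head descends its own loss.
    a_new = a - lr * (p_y - y) * f
    b_new = b - lr * (p_s - s) * f

    # Gradient reversal: the encoder descends the replay loss
    # but ascends the speaker loss, scaled by lam.
    g_f = g_y - lam * g_s
    w_new = [wi - lr * g_f * xi for wi, xi in zip(w, x)]
    return w_new, a_new, b_new
```

With `lam=0.0` the encoder update reduces to ordinary replay-detection training. With `lam > 0` any direction that helps the speaker head is pushed back, which is the sense in which the feature space is meant to reduce discrimination between speakers. A practical system would replace the hand-written gradients with an automatic-differentiation framework and a multi-class speaker head.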