Ondrej Novotný, P. Matejka, O. Glembek, Oldrich Plchot, F. Grézl, L. Burget, J. Černocký
{"title":"Analysis of the DNN-based SRE systems in multi-language conditions","authors":"Ondrej Novotný, P. Matejka, O. Glembek, Oldrich Plchot, F. Grézl, L. Burget, J. Černocký","doi":"10.1109/SLT.2016.7846265","DOIUrl":null,"url":null,"abstract":"This paper analyzes the behavior of our state-of-the-art Deep Neural Network/i-vector/PLDA-based speaker recognition systems in multi-language conditions. On the “Language Pack” of the PRISM set, we evaluate the systems' performance using the NIST's standard metrics. We show that not only the gain from using DNNs vanishes, nor using dedicated DNNs for target conditions helps, but also the DNN-based systems tend to produce de-calibrated scores under the studied conditions. This work gives suggestions for directions of future research rather than any particular solutions to these issues.","PeriodicalId":281635,"journal":{"name":"2016 IEEE Spoken Language Technology Workshop (SLT)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2016.7846265","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10
Abstract
This paper analyzes the behavior of our state-of-the-art Deep Neural Network/i-vector/PLDA-based speaker recognition systems in multi-language conditions. On the “Language Pack” of the PRISM set, we evaluate the systems' performance using the NIST's standard metrics. We show that not only the gain from using DNNs vanishes, nor using dedicated DNNs for target conditions helps, but also the DNN-based systems tend to produce de-calibrated scores under the studied conditions. This work gives suggestions for directions of future research rather than any particular solutions to these issues.