Adversarial Training for Multi-domain Speaker Recognition

2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP) | Pub Date: 2020-11-17 | DOI: 10.1109/ISCSLP49672.2021.9362053

Qing Wang, Wei Rao, Pengcheng Guo, Lei Xie


Citations: 5

Abstract

In real-life applications, the performance of speaker recognition systems always degrades when there is a mismatch between training and evaluation data. Many domain adaptation methods have been used successfully to eliminate domain mismatch in speaker recognition. However, the training and evaluation data are themselves usually composed of several subsets, and the internal variation within each dataset can also be regarded as a set of different domains. Differently distributed subsets in the source or target domain dataset can therefore cause multi-domain mismatches, which affect speaker recognition performance. In this study, we propose to use adversarial training for multi-domain speaker recognition to address both the domain mismatch and the dataset variance problems. With the proposed method, we obtain speech representations that are both multi-domain-invariant and speaker-discriminative. Experimental results on the DAC13 dataset indicate that the proposed method not only solves the multi-domain mismatch problem effectively but also outperforms the unsupervised domain adaptation methods it is compared against.
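The abstract does not say which adversarial mechanism the authors use. As a rough illustration only, the sketch below assumes a gradient-reversal setup, one common way to train an embedding that is speaker-discriminative yet domain-invariant: a speaker classifier and a domain classifier share one embedding, and the domain gradient is flipped in sign before it reaches the embedding. The linear classifiers, the weight `lam`, and every function name here are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_and_grad(logits, label):
    """Softmax cross-entropy loss and its gradient w.r.t. the logits."""
    probs = softmax(logits)
    loss = -math.log(probs[label])
    grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    return loss, grad


def matvec(W, x):
    """Compute W @ x for a row-major list-of-lists matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matT_vec(W, g):
    """Compute W^T @ g, i.e. backpropagate logit gradients to the input."""
    return [sum(W[i][j] * g[i] for i in range(len(W))) for j in range(len(W[0]))]


def embedding_gradient(embedding, W_spk, W_dom, speaker_id, domain_id, lam=1.0):
    """Gradient reaching the shared embedding under gradient reversal.

    Assumption: both heads are plain linear classifiers (hypothetical).
    The embedding descends the speaker loss but ascends the domain loss,
    so it is pushed toward features the domain classifier cannot use.
    """
    spk_loss, g_spk = cross_entropy_and_grad(matvec(W_spk, embedding), speaker_id)
    dom_loss, g_dom = cross_entropy_and_grad(matvec(W_dom, embedding), domain_id)
    d_spk = matT_vec(W_spk, g_spk)
    d_dom = matT_vec(W_dom, g_dom)
    # Gradient reversal: subtract (rather than add) the scaled domain gradient.
    grad = [a - lam * b for a, b in zip(d_spk, d_dom)]
    return spk_loss, dom_loss, grad


if __name__ == "__main__":
    emb = [0.5, -1.0]
    W_spk = [[1.0, 0.0], [0.0, 1.0]]                 # 2 hypothetical speakers
    W_dom = [[1.0, -1.0], [0.5, 0.5], [-1.0, 0.0]]   # 3 hypothetical domains
    print(embedding_gradient(emb, W_spk, W_dom, speaker_id=0, domain_id=2, lam=0.5))
```

In this sketch the domain labels can index more than two subsets, which is how a single domain classifier could cover several source and target subsets at once; whether the paper uses one multi-class discriminator or several is not stated in the abstract.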
