LDA-based Speaker Verification in Multi-Enrollment Scenario using Expected Vector Approach

2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP) Pub Date : 2021-01-24 DOI:10.1109/ISCSLP49672.2021.9362113

Meet H. Soni, Ashish Panda


Citations: 1

Abstract



The multi-enrollment scoring scenario, in which multiple utterances are available for an enrollment speaker, is one of the less explored problems in the Probabilistic Linear Discriminant Analysis (PLDA) scoring literature. Because the closed-form PLDA scoring formula for the multi-enrollment scenario is impractical, alternative heuristic approaches are widely used in both i-vector and x-vector based speaker verification systems. In this paper, we describe an Expected Vector approach for obtaining a single vector from multiple enrollment utterances. The approach uses a trained PLDA model to compute the expected class center, under that model, given a set of enrollment vectors, which yields a more meaningful class center representation. The resulting vector can then be used to score a trial with the two-vector scoring formula of the given PLDA model. We compare the proposed approach with various heuristic approaches and show that it provides significant improvements in terms of Equal Error Rate (EER) and minimum Detection Cost Function (minDCF). We report results for an x-vector system trained on the VoxCeleb dataset, with various implementations of PLDA and trials designed on the VoxCeleb and LibriSpeech datasets.
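The abstract does not give the formula for the expected class center, so the following is only a minimal sketch of the idea under an assumed model, not the authors' implementation. It assumes a two-covariance PLDA model in which the latent speaker center is drawn from a Gaussian with global mean `mean` and between-class covariance `between_cov`, and each enrollment vector is drawn around that center with within-class covariance `within_cov`. Under that assumption the expected class center is the standard Gaussian posterior mean. The function name and all parameter names are hypothetical.

```python
import numpy as np


def expected_class_center(enroll, mean, between_cov, within_cov):
    """Posterior mean of the latent class center given enrollment vectors.

    Assumed model (two-covariance PLDA, chosen for illustration only):
        y       ~ N(mean, between_cov)   # latent speaker center
        x_i | y ~ N(y, within_cov)       # each enrollment vector
    """
    enroll = np.atleast_2d(np.asarray(enroll, dtype=float))
    n = enroll.shape[0]

    between_prec = np.linalg.inv(between_cov)
    within_prec = np.linalg.inv(within_cov)

    # Posterior precision: prior precision plus n copies of the likelihood precision.
    post_prec = between_prec + n * within_prec
    # Precision-weighted combination of the prior mean and the sample mean.
    rhs = between_prec @ mean + n * (within_prec @ enroll.mean(axis=0))
    return np.linalg.solve(post_prec, rhs)


# Toy example: two 2-dimensional enrollment vectors, identity covariances.
toy_enroll = np.array([[2.0, 0.0], [4.0, 0.0]])
toy_mean = np.zeros(2)
toy_center = expected_class_center(toy_enroll, toy_mean, np.eye(2), np.eye(2))
plain_average = toy_enroll.mean(axis=0)
print("expected center:", toy_center)
print("plain average:  ", plain_average)
```

In this toy setting the plain average of the enrollment vectors is (3, 0), while the expected center is (2, 0): the estimate is pulled toward the global mean, and the pull weakens as more enrollment vectors become available. The resulting single vector could then be passed to whatever two-vector PLDA scoring routine the system already uses.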

