Phoneme dependent inter-session variability reduction for speaker verification

Int. J. Biom. Pub Date: 2015-07-01. DOI: 10.1504/IJBM.2015.070922

Haoze Lu, Wenbin Zhang, Y. Horiuchi, S. Kuroiwa

Abstract

GMM-UBM super-vectors can lead to poor modelling for speaker verification because of inter-session variability, especially when only a small number of training utterances is available. In this study, we propose a phoneme-dependent method to suppress inter-session variability. A speaker's model is represented by several phoneme-specific Gaussian mixture models (GMMs). Each covers an individual phoneme whose inter-session variability can be constrained in an inter-session independent subspace constructed by principal component analysis (PCA) from a corpus uttered by a single speaker and recorded over a long period. SVM-based experiments were performed on a large corpus constructed by the National Research Institute of Police Science (NRIPS) for evaluating Japanese speaker recognition, and they demonstrate the improvements gained from the proposed method.
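The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading: for a single phoneme, PCA over one speaker's super-vectors from many sessions yields the dominant session-variability directions, which are then projected out of a new super-vector (similar in spirit to nuisance attribute projection). The function names, the synthetic data, the dimension, and the choice of `k` are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def session_subspace(supervectors, k):
    """Return the top-k principal directions of session-to-session variation.

    supervectors: array of shape (n_sessions, dim) holding super-vectors of
    ONE speaker for ONE phoneme (hypothetical input layout).
    """
    # Centering removes the speaker-specific mean, leaving session effects.
    centered = supervectors - supervectors.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # shape (dim, k)


def remove_session_variability(x, basis):
    """Project x onto the orthogonal complement of the session subspace."""
    return x - basis @ (basis.T @ x)


# Synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(0)
dim, n_sessions = 20, 30

direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)       # true session direction
speaker_mean = rng.normal(size=dim)          # long-term speaker of the corpus

# Each session shifts the super-vector along `direction`, plus small noise.
shifts = rng.normal(size=(n_sessions, 1))
sessions = (speaker_mean + shifts * direction
            + 0.01 * rng.normal(size=(n_sessions, dim)))

basis = session_subspace(sessions, k=1)

# A different speaker's super-vector, contaminated by a session shift.
other = rng.normal(size=dim) + 3.0 * direction
cleaned = remove_session_variability(other, basis)
```

Under this reading, one such basis would be estimated per phoneme, and the cleaned per-phoneme super-vectors would then be passed to the SVM back end.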
