Phoneme dependent inter-session variability reduction for speaker verification

Haoze Lu, Wenbin Zhang, Y. Horiuchi, S. Kuroiwa
Journal: Int. J. Biom. Publication date: 2015-07-01. Publication type: Journal Article. DOI: 10.1504/IJBM.2015.070922 (https://doi.org/10.1504/IJBM.2015.070922). Source: Semantic Scholar.
Citations: 0

Abstract

GMM-UBM supervectors can model speakers poorly in speaker verification because of inter-session variability, especially when only a small number of training utterances are available. In this study, we propose a phoneme-dependent method to suppress inter-session variability. A speaker's model is represented by several phoneme Gaussian mixture models (GMMs), each covering an individual phoneme. The inter-session variability of each phoneme is constrained in an inter-session independent subspace constructed by principal component analysis (PCA), using a corpus uttered by a single speaker and recorded over a long period. SVM-based experiments on a large corpus, constructed by the National Research Institute of Police Science (NRIPS) to evaluate Japanese speaker recognition, demonstrate the improvements gained from the proposed method.
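The abstract does not give implementation details, so the following is only a minimal sketch of one plausible reading of the PCA step: estimate the dominant directions of session-to-session variation from a single speaker's multi-session vectors for one phoneme, then project those directions out. The function names, the toy dimensions, and the synthetic data are all assumptions made for illustration. The projection used here resembles nuisance attribute projection and is not necessarily the authors' exact procedure.

```python
import numpy as np

def session_subspace(session_vectors, k):
    """Top-k principal directions of variation across sessions.

    session_vectors: array of shape (n_sessions, dim), one vector per
    session, all from the same speaker and the same phoneme (assumed).
    Returns an orthonormal basis of shape (dim, k).
    """
    centered = session_vectors - session_vectors.mean(axis=0)
    # Right singular vectors of the centered data are the PCA axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def remove_session_variability(x, basis):
    """Project x onto the orthogonal complement of span(basis)."""
    return x - basis @ (basis.T @ x)

# Synthetic stand-in for per-phoneme vectors (hypothetical data).
rng = np.random.default_rng(0)
dim, n_sessions, k = 8, 20, 2
true_dirs = np.linalg.qr(rng.normal(size=(dim, k)))[0]  # hidden session directions
speaker_mean = rng.normal(size=dim)
session_offsets = (3.0 * rng.normal(size=(n_sessions, k))) @ true_dirs.T
noise = 0.01 * rng.normal(size=(n_sessions, dim))
sessions = speaker_mean + session_offsets + noise

basis = session_subspace(sessions, k)
cleaned = np.array([remove_session_variability(s, basis) for s in sessions])

print("variance before:", sessions.var(axis=0).sum())
print("variance after: ", cleaned.var(axis=0).sum())
```

In a phoneme-dependent setup, one would presumably estimate a separate basis for each phoneme and apply it to that phoneme's vectors before SVM training; the abstract does not say how the per-phoneme results are combined.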