特征空间中的鲁棒说话人聚类

R. Faltlhauser, G. Ruske
{"title":"特征空间中的鲁棒说话人聚类","authors":"R. Faltlhauser, G. Ruske","doi":"10.1109/ASRU.2001.1034588","DOIUrl":null,"url":null,"abstract":"We propose a speaker clustering scheme working in 'eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for eigenvoice projection) like Gaussian mixture models as well as conventional HMMs. In case of HMMs, re-projection to the original space readily yields acoustic models. Clustering in subspace produces a well-balanced cluster and is easy to control. In the field of speaker adaptation, several principal techniques can be distinguished. The most prominent among them are Bayesian adaptation (e.g. MAP), transformation based approaches (MLLR - maximum likelihood linear regression), as well as so-called eigenspace techniques. Especially the latter have become increasingly popular, as they make use of a-priori information about the distribution of speaker models. The basic approach is commonly called the eigenvoice (EV) approach. Besides these techniques, speaker clustering is a further attractive adaptation scheme, especially since it can be - and has been - easily combined with the above methods.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"142 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Robust speaker clustering in eigenspace\",\"authors\":\"R. Faltlhauser, G. Ruske\",\"doi\":\"10.1109/ASRU.2001.1034588\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a speaker clustering scheme working in 'eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for eigenvoice projection) like Gaussian mixture models as well as conventional HMMs. In case of HMMs, re-projection to the original space readily yields acoustic models. Clustering in subspace produces a well-balanced cluster and is easy to control. In the field of speaker adaptation, several principal techniques can be distinguished. The most prominent among them are Bayesian adaptation (e.g. MAP), transformation based approaches (MLLR - maximum likelihood linear regression), as well as so-called eigenspace techniques. Especially the latter have become increasingly popular, as they make use of a-priori information about the distribution of speaker models. The basic approach is commonly called the eigenvoice (EV) approach. Besides these techniques, speaker clustering is a further attractive adaptation scheme, especially since it can be - and has been - easily combined with the above methods.\",\"PeriodicalId\":118671,\"journal\":{\"name\":\"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.\",\"volume\":\"142 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-12-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2001.1034588\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2001.1034588","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25

摘要

我们提出了一种在“特征空间”中工作的说话人聚类方案。使用“特征语音”将说话人模型转换为低维子空间。对于说话人聚类过程,可以使用简单的距离度量,例如欧几里得距离。此外,聚类可以用基本模型(用于特征语音投影)如高斯混合模型和传统hmm来完成。在hmm的情况下,重新投影到原始空间很容易产生声学模型。子空间中的聚类产生一个平衡良好的聚类,并且易于控制。在说话人自适应领域,可以区分出几种主要的技术。其中最突出的是贝叶斯自适应(例如MAP),基于变换的方法(MLLR -最大似然线性回归),以及所谓的特征空间技术。尤其是后者已经变得越来越流行,因为它们利用了关于说话人模型分布的先验信息。基本方法通常被称为特征语音(EV)方法。除了这些技术之外,说话人聚类是另一种有吸引力的自适应方案,特别是因为它可以很容易地与上述方法相结合。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Robust speaker clustering in eigenspace
We propose a speaker clustering scheme working in 'eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for eigenvoice projection) like Gaussian mixture models as well as conventional HMMs. In case of HMMs, re-projection to the original space readily yields acoustic models. Clustering in subspace produces a well-balanced cluster and is easy to control. In the field of speaker adaptation, several principal techniques can be distinguished. The most prominent among them are Bayesian adaptation (e.g. MAP), transformation based approaches (MLLR - maximum likelihood linear regression), as well as so-called eigenspace techniques. Especially the latter have become increasingly popular, as they make use of a-priori information about the distribution of speaker models. The basic approach is commonly called the eigenvoice (EV) approach. Besides these techniques, speaker clustering is a further attractive adaptation scheme, especially since it can be - and has been - easily combined with the above methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信