参数空间、模型和度量的有效结合

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI:10.1109/ASRU.2007.4430120

Themos Stafylakis, V. Katsouros, G. Carayannis

{"title":"参数空间、模型和度量的有效结合","authors":"Themos Stafylakis, V. Katsouros, G. Carayannis","doi":"10.1109/ASRU.2007.4430120","DOIUrl":null,"url":null,"abstract":"In this paper we present a method of combining several acoustic parametric spaces, statistical models and distance metrics in speaker diarization task. Focusing our interest on the post-segmentation part of the problem, we adopt an incremental feature selection and fusion algorithm based on the Maximum Entropy Principle and Iterative Scaling Algorithm that combines several statistical distance measures on speech-chunk pairs. By this approach, we place the merging-of-chunks clustering process into a probabilistic framework. We also propose a decomposition of the input space according to gender, recording conditions and chunk lengths. The algorithm produced highly competitive results compared to GMM-UBM state-of-the-art methods.","PeriodicalId":371729,"journal":{"name":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient combination of parametric spaces, models and metrics for speaker diarization1\",\"authors\":\"Themos Stafylakis, V. Katsouros, G. Carayannis\",\"doi\":\"10.1109/ASRU.2007.4430120\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we present a method of combining several acoustic parametric spaces, statistical models and distance metrics in speaker diarization task. Focusing our interest on the post-segmentation part of the problem, we adopt an incremental feature selection and fusion algorithm based on the Maximum Entropy Principle and Iterative Scaling Algorithm that combines several statistical distance measures on speech-chunk pairs. By this approach, we place the merging-of-chunks clustering process into a probabilistic framework. We also propose a decomposition of the input space according to gender, recording conditions and chunk lengths. The algorithm produced highly competitive results compared to GMM-UBM state-of-the-art methods.\",\"PeriodicalId\":371729,\"journal\":{\"name\":\"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2007.4430120\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2007.4430120","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

本文提出了一种将几种声学参数空间、统计模型和距离度量相结合的方法。针对问题的后分割部分，我们采用了一种基于最大熵原理和迭代缩放算法的增量特征选择和融合算法，该算法结合了语音块对的几种统计距离度量。通过这种方法，我们将块合并聚类过程置于概率框架中。我们还建议根据性别、记录条件和块长度对输入空间进行分解。与GMM-UBM最先进的方法相比，该算法产生了极具竞争力的结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Efficient combination of parametric spaces, models and metrics for speaker diarization1

In this paper we present a method of combining several acoustic parametric spaces, statistical models and distance metrics in speaker diarization task. Focusing our interest on the post-segmentation part of the problem, we adopt an incremental feature selection and fusion algorithm based on the Maximum Entropy Principle and Iterative Scaling Algorithm that combines several statistical distance measures on speech-chunk pairs. By this approach, we place the merging-of-chunks clustering process into a probabilistic framework. We also propose a decomposition of the input space according to gender, recording conditions and chunk lengths. The algorithm produced highly competitive results compared to GMM-UBM state-of-the-art methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)

自引率

0.00%

发文量