Bounded support Gaussian mixture modeling of speech spectra

IEEE Trans. Speech Audio Process. Pub Date: 2003-02-19. DOI: 10.1109/TSA.2002.805639

J. Lindblom, J. Samuelsson

Citations: 66

Abstract

Gaussian mixture (GM) models have recently found new applications in speech processing, particularly in speech coding. This paper reviews GM-based quantization and prediction. The main contribution is a discussion of GM model optimization. Two previously presented EM-type algorithms are analyzed in some detail, and models are estimated and evaluated experimentally using theoretical measures as well as GM-based speech spectrum coding and prediction. It has been argued that, since many sources have bounded support, this property should be exploited in both the choice of model and the optimization algorithm. The advantages of a bounded support approach are quantified through low-dimensional modeling examples, which illustrate the behavior of the two algorithms graphically, and through full-scale evaluation of GM-based systems. For all evaluation techniques in the study, model accuracy improves when the bounded support approach is adopted. The gains are typically largest for models with diagonal covariance matrices.
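
To make the idea of a bounded-support mixture concrete, the following is a minimal one-dimensional sketch. It assumes one possible formulation, in which each Gaussian component is truncated to an interval and renormalized by the probability mass it places on that interval. This formulation, and the names `bounded_gm_pdf` and `integrate`, are illustrative assumptions and are not taken from the paper, whose exact model and EM-type algorithms are not described in the abstract.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of an ordinary (unbounded) Gaussian."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution of an ordinary Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bounded_gm_pdf(x, weights, means, sigmas, lo, hi):
    """Hypothetical 1-D mixture of Gaussians, each truncated to [lo, hi].

    Assumption for illustration: every component is divided by the mass it
    places on [lo, hi], so the mixture integrates to 1 over the support.
    """
    if x < lo or x > hi:
        return 0.0  # no probability outside the bounded support
    total = 0.0
    for w, mu, sigma in zip(weights, means, sigmas):
        mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
        total += w * normal_pdf(x, mu, sigma) / mass
    return total


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule numerical integration, used only as a sanity check."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h


# Example parameters (made up for this sketch): a source confined to [0, 1].
weights = [0.6, 0.4]
means = [0.1, 0.8]
sigmas = [0.2, 0.3]


def density(x):
    return bounded_gm_pdf(x, weights, means, sigmas, 0.0, 1.0)


# Mass that the same, untruncated mixture would place outside [0, 1].
leaked = sum(
    w * (1.0 - (normal_cdf(1.0, m, s) - normal_cdf(0.0, m, s)))
    for w, m, s in zip(weights, means, sigmas)
)
print("integral over support:", integrate(density, 0.0, 1.0))
print("mass an unbounded GM would place outside the support:", leaked)
```

In this sketch the unbounded mixture assigns a noticeable share of its probability to values the source can never take, while the truncated version assigns none. That wasted mass is one intuitive reason a bounded-support model might fit such a source more accurately.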
