Language identification based on improved training and classification

2015 23rd Iranian Conference on Electrical Engineering Pub Date : 2015-07-02 DOI:10.1109/IRANIANCEE.2015.7146206

E. Partovi, S. Ahadi, N. Faraji


Citations: 0

Abstract


Language identification is the automatic process of detecting the language of a speech utterance. For example, in automatic speech translation, the spoken language must be identified before any recognition or translation can take place. In this paper we propose two methods, one for training and one for testing language classifiers. The first uses Kullback-Leibler divergence (KLD) to improve the training of Gaussian mixture models (GMMs); the second uses Frame Selection Decoding (FSD) for classification. The resulting system improves significantly on the baseline system. Acoustic features are extracted directly from speech, and delta and shifted delta cepstral (SDC) parameters are appended to capture temporal variation. Our approach achieves a language identification performance of 78.6% across 11 languages on the OGI database, a relative error rate reduction of 27.95% compared with a baseline system that uses GMM-UBM for classification.
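
The shifted delta cepstral parameters mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the SDC configuration, so the defaults below (d=1, P=3, k=7, as in the widely used 7-1-3-7 setup) are assumptions, as are the function name `shifted_delta_cepstra` and the choice to clamp out-of-range frame indices at the utterance edges. This is not the authors' implementation, only an instance of the standard SDC definition, in which block i at frame t is c(t + iP + d) - c(t + iP - d).

```python
from typing import List

Frame = List[float]


def shifted_delta_cepstra(
    cepstra: List[Frame], d: int = 1, p: int = 3, k: int = 7
) -> List[Frame]:
    """Stack k delta blocks, spaced p frames apart, for every frame.

    Block i at time t is cepstra[t + i*p + d] - cepstra[t + i*p - d].
    Indices outside the utterance are clamped to the nearest valid frame
    (an edge-handling assumption, not taken from the paper).
    """
    n_frames = len(cepstra)
    if n_frames == 0:
        return []

    def frame_at(index: int) -> Frame:
        # Clamp the index into [0, n_frames - 1].
        return cepstra[min(max(index, 0), n_frames - 1)]

    sdc: List[Frame] = []
    for t in range(n_frames):
        stacked: Frame = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            # Element-wise difference gives one delta block of N values.
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


# Toy input: 10 frames of 2 coefficients; the first coefficient
# grows by 1 per frame, the second is constant.
toy = [[float(t), 0.0] for t in range(10)]
features = shifted_delta_cepstra(toy, d=1, p=3, k=2)
print(len(features), len(features[0]))  # 10 frames, k * N = 4 values each
print(features[3])
```

Each output frame has k × N values, so with 7 cepstral coefficients and k=7 the SDC vector would have 49 dimensions, which is then appended to the static and delta features before GMM training.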
