Language identification based on improved training and classification

E. Partovi, S. Ahadi, N. Faraji
Published in: 2015 23rd Iranian Conference on Electrical Engineering
Publication date: 2015-07-02
DOI: https://doi.org/10.1109/IRANIANCEE.2015.7146206
Citations: 0

Abstract

Language identification is the automatic process of detecting the language of a speech utterance. In automatic translation, for example, the spoken language must be identified before any recognition or translation can take place. In this paper we propose two methods, one for training and one for testing the language classifiers. The first uses Kullback-Leibler divergence (KLD) to improve the training of Gaussian mixture models (GMMs); the second uses Frame Selection Decoding (FSD) for classification. Acoustic features are extracted directly from the speech signal, and delta and shifted delta cepstral parameters are appended to capture temporal variation. On the OGI database, the approach achieves a language identification performance of 78.6% across 11 languages, a relative error-rate reduction of 27.95% compared with a baseline system that uses GMM-UBM for classification.
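The feature step mentioned in the abstract (static cepstra extended with delta and shifted delta cepstral parameters) can be illustrated with a minimal sketch. The abstract does not give the configuration, so everything specific here is an assumption made for illustration and not the authors' setup: the function names `delta` and `shifted_delta`, the two-point delta, the edge-repetition padding, and the values d=1, P=3, k=7.

```python
import numpy as np


def delta(cepstra: np.ndarray, d: int = 1) -> np.ndarray:
    """Two-point delta c[t + d] - c[t - d]; edge frames are repeated (assumption)."""
    num_frames = cepstra.shape[0]
    padded = np.pad(cepstra, ((d, d), (0, 0)), mode="edge")
    # padded[t + 2d] is c[t + d] and padded[t] is c[t - d] (clipped at the edges)
    return padded[2 * d:] - padded[:num_frames]


def shifted_delta(cepstra: np.ndarray, d: int = 1, P: int = 3, k: int = 7) -> np.ndarray:
    """Stack k delta vectors taken at frames t, t + P, ..., t + (k - 1) * P."""
    num_frames = cepstra.shape[0]
    deltas = delta(cepstra, d)
    span = (k - 1) * P
    # Frames past the end of the utterance repeat the last delta vector (assumption)
    padded = np.pad(deltas, ((0, span), (0, 0)), mode="edge")
    blocks = [padded[i * P: i * P + num_frames] for i in range(k)]
    return np.hstack(blocks)


def stack_features(cepstra: np.ndarray) -> np.ndarray:
    """Append delta and shifted delta parameters to the static cepstra."""
    return np.hstack([cepstra, delta(cepstra), shifted_delta(cepstra)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    static = rng.standard_normal((100, 7))  # 100 frames, 7 cepstral coefficients (made-up data)
    print(stack_features(static).shape)
```

With 7 static coefficients and k=7, each frame grows from 7 to 7 + 7 + 49 = 63 dimensions. With these particular assumed settings the first shifted-delta block coincides with the plain delta, so a real system might drop one of the two.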