Authors: E. Partovi, S. Ahadi, N. Faraji
Published in: 2015 23rd Iranian Conference on Electrical Engineering, 2015-07-02
DOI: 10.1109/IRANIANCEE.2015.7146206 (https://doi.org/10.1109/IRANIANCEE.2015.7146206)
Language identification based on improved training and classification
Language identification is the automatic process of detecting the language of a speech utterance. In automatic translation, for example, the spoken language must be identified before any recognition or translation can take place. In this paper we propose two methods, one for training and one for testing the language classifiers. The first uses Kullback-Leibler divergence (KLD) to improve the training of Gaussian mixture models (GMMs); the second uses Frame Selection Decoding (FSD) for classification. The resulting system significantly improves on the baseline system. Acoustic features are extracted directly from speech, and delta and shifted delta cepstral parameters are appended to the features to capture temporal variation. Our approach achieves a language identification performance of 78.6% among 11 languages on the OGI database, a relative error rate reduction of 27.95% compared with a baseline system that uses GMM-UBM for classification.
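The abstract mentions shifted delta cepstral (SDC) parameters without describing how they are formed. The following is a minimal sketch of the usual SDC construction, not the authors' implementation: for each frame, k delta vectors are computed at offsets spaced P frames apart, each as the difference between the frames d steps ahead and d steps behind, and the results are stacked. The function name, the default values of d, P and k, and the clamping of indices at the utterance boundaries are assumptions made for illustration.

```python
def shifted_delta_cepstra(frames, d=1, P=3, k=7):
    """Stack k delta vectors, spaced P frames apart, for every frame.

    frames: list of equal-length cepstral vectors (one list per frame).
    Block i at time t is c[t + i*P + d] - c[t + i*P - d].
    Indices outside the utterance are clamped to the first or last
    frame; this boundary handling is an assumption of this sketch.
    """
    T = len(frames)

    def frame_at(idx):
        # Clamp the index so look-ahead never runs off the utterance.
        return frames[min(max(idx, 0), T - 1)]

    sdc = []
    for t in range(T):
        stacked = []
        for i in range(k):
            ahead = frame_at(t + i * P + d)
            behind = frame_at(t + i * P - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


# Toy input: 30 frames of 2 coefficients that grow linearly with time.
toy_frames = [[float(t), 2.0 * t] for t in range(30)]
toy_sdc = shifted_delta_cepstra(toy_frames, d=1, P=3, k=7)
```

Each output vector has k times as many components as an input frame, so 7 cepstral coefficients with k = 7 would give 49 SDC components per frame. Whether the paper uses these particular settings is not stated in the abstract.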