Title: Language identification using discriminative weighted language models
Authors: Shizhen Wang, Jia Liu, Runsheng Liu
Published in: 2004 International Symposium on Chinese Spoken Language Processing, 2004-12-15
DOI: 10.1109/CHINSL.2004.1409584
Citations: 0
Abstract
In this paper, discriminative weighted language models are proposed to better distinguish between similar languages. In the first stage, a parallel phone recognition followed by language modeling (PPRLM) system hypothesizes the two best candidate languages, which are then rescored using discriminative language models. Experimental results show that, compared with traditional one-pass language identification (LID) systems, the proposed two-pass method greatly improves performance without considerably increasing computational cost. Tested on the evaluation set of the CallFriend corpus, the final system achieved an error rate of 14.90% on the 30-second, 12-way closed-set task.
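The two-pass decision described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the language names, scores, and weight values are hypothetical, and the first-pass fusion is simplified to an equal-weight sum of per-recognizer LM scores, while the second pass rescores only the top-two (most confusable) pair with discriminatively trained pair-specific weights.

```python
def identify_language(stream_scores, pair_weights):
    """Hypothetical two-pass LID decision.

    stream_scores: {language: [per-recognizer LM log-likelihoods]}
    pair_weights:  {(lang_a, lang_b): [per-recognizer weights]},
                   assumed trained discriminatively for each confusable pair.
    """
    # First pass (PPRLM-style): fuse per-recognizer scores by an equal-weight sum
    # and keep the two best candidate languages.
    fused = {lang: sum(scores) for lang, scores in stream_scores.items()}
    a, b = sorted(fused, key=fused.get, reverse=True)[:2]

    # Second pass: rescore only the confusable pair with its pair-specific weights.
    w = pair_weights.get((a, b)) or pair_weights.get((b, a))
    if w is None:
        return a  # no trained weights for this pair: keep the first-pass winner

    score_a = sum(wi * si for wi, si in zip(w, stream_scores[a]))
    score_b = sum(wi * si for wi, si in zip(w, stream_scores[b]))
    return a if score_a >= score_b else b
```

With suitably weighted streams, the second pass can overturn the first-pass winner for a confusable pair while leaving clearly separated languages untouched, which is the mechanism the abstract credits for the improvement over one-pass systems.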