{"title":"A Lattice-Based Phonotactic Language Recognition System with CMLLR Adaptation and Its Implementation Issues","authors":"C. Leung, R. Tong, B. Ma, Haizhou Li","doi":"10.1109/IALP.2009.67","journal":{"name":"2009 International Conference on Asian Language Processing","volume":"75 1","pages":"0"},"publicationDate":"2009-12-07","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"5","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/IALP.2009.67"}
A Lattice-Based Phonotactic Language Recognition System with CMLLR Adaptation and Its Implementation Issues
This paper presents a “non-complicated” automatic spoken language recognition system that can be implemented effectively with publicly available toolkits (such as HTK, SRILM and SVM-Light) and corpus resources (such as the Switchboard, CallFriend, OHSU and NIST LRE07 speech corpora). The system consists of two context-independent phone recognizers, a vector space modelling classifier, and an equal-weight fusion of the likelihood scores from the classifier. CMLLR adaptation and phone lattices are also used. Our experiments show that both techniques are essential to the performance improvement obtained. Despite its simplicity, the system achieves an equal error rate (EER) of 2.72% in the 30-second condition on the NIST LRE-2007 evaluation data set. We also describe how we used the large amount of available training data to test different phone recognizer configurations effectively. This practical experience should interest newcomers who plan to participate in the NIST Language Recognition Evaluation or similar international benchmark campaigns.
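As a rough illustration of two quantities mentioned in the abstract, the sketch below averages per-trial scores from several subsystems with equal weights and then estimates an equal error rate from target and non-target trial scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `fuse_equal_weight` and `equal_error_rate`, the toy scores, and the simple threshold sweep are all hypothetical, and the official NIST LRE scoring procedure may differ in its details.

```python
from typing import List, Sequence


def fuse_equal_weight(system_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-trial scores of several subsystems with equal weights.

    Assumption: every subsystem scores the same trials in the same order.
    """
    n_trials = len(system_scores[0])
    if any(len(scores) != n_trials for scores in system_scores):
        raise ValueError("all subsystems must score the same number of trials")
    n_systems = len(system_scores)
    return [
        sum(scores[i] for scores in system_scores) / n_systems
        for i in range(n_trials)
    ]


def equal_error_rate(
    target_scores: Sequence[float], nontarget_scores: Sequence[float]
) -> float:
    """Estimate the EER by sweeping a decision threshold over observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where miss rate and false-alarm rate are closest, and is
    reported as the mean of the two rates at that point.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(float("inf"))  # threshold that rejects every trial
    best_gap, best_eer = float("inf"), 1.0
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer


if __name__ == "__main__":
    # Hypothetical scores from two phone-recognizer front ends (toy values).
    fused_targets = fuse_equal_weight([[2.5, 1.0, 0.4, 1.2], [1.5, 2.0, 0.6, 0.8]])
    fused_nontargets = fuse_equal_weight([[0.1, 0.9, -1.2, 0.0], [-0.1, 0.5, -0.8, 0.4]])
    print(f"EER = {equal_error_rate(fused_targets, fused_nontargets):.2%}")
```

Equal weighting needs no held-out data to tune fusion weights, which fits the abstract's emphasis on keeping the system simple; a trained fusion backend would be a different design choice.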