{"title":"Noisy channel adaptation in language identification","authors":"Sriram Ganapathy, M. Omar, Jason W. Pelecanos","doi":"10.1109/SLT.2012.6424241","DOIUrl":null,"url":null,"abstract":"Language identification (LID) of speech data recorded over noisy communication channels is a challenging problem especially when the LID system is tested on speech data from an unseen communication channel (not seen in training). In this paper, we consider the scenario in which a small amount of adaptation data is available from a new communication channel. Various approaches are investigated for efficient utilization of the adaptation data in a supervised as well as unsupervised setting. In a supervised adaptation framework, we show that support vector machines (SVMs) with higher order polynomial kernels (HO-SVM) trained using lower dimensional representations of the the Gaussian mixture model supervectors (GSVs) provide significant performance improvements over the baseline SVM-GSV system. In these LID experiments, we obtain 30% reduction in error-rate with 6 hours of adaptation data for a new channel. For unsupervised adaptation, we develop an iterative procedure for re-labeling the development data using a co-training framework. In these experiments, we obtain considerable improvements(relative improvements of 13 %) over a self-training framework with the HO-SVM models.","PeriodicalId":375378,"journal":{"name":"2012 IEEE Spoken Language Technology Workshop (SLT)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2012.6424241","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Language identification (LID) of speech data recorded over noisy communication channels is a challenging problem especially when the LID system is tested on speech data from an unseen communication channel (not seen in training). In this paper, we consider the scenario in which a small amount of adaptation data is available from a new communication channel. Various approaches are investigated for efficient utilization of the adaptation data in a supervised as well as unsupervised setting. In a supervised adaptation framework, we show that support vector machines (SVMs) with higher order polynomial kernels (HO-SVM) trained using lower dimensional representations of the the Gaussian mixture model supervectors (GSVs) provide significant performance improvements over the baseline SVM-GSV system. In these LID experiments, we obtain 30% reduction in error-rate with 6 hours of adaptation data for a new channel. For unsupervised adaptation, we develop an iterative procedure for re-labeling the development data using a co-training framework. In these experiments, we obtain considerable improvements(relative improvements of 13 %) over a self-training framework with the HO-SVM models.