Authors: S. Shahnawazuddin, R. Sinha
Venue: 2013 Annual IEEE India Conference (INDICON)
Published: 2013-12-01
DOI: 10.1109/INDCON.2013.6725938 (https://doi.org/10.1109/INDCON.2013.6725938)
Fast on-line adaptation using KSVD based acoustic clustering
This work addresses on-line adaptation for real-time applications. In such systems, unsupervised adaptation has to be performed with a very small amount of adaptation data, and the computational complexity should be as low as possible to keep the system latency in check. To address both issues, a fast adaptation procedure based on model interpolation, employing speaker cluster models as bases, is presented. Deriving the bases by acoustic clustering of the training speakers is observed to greatly reduce the complexity compared with techniques that employ speaker-adapted models as bases. In addition, a KSVD-based acoustic clustering scheme is proposed, and acoustic clustering is explored in both supervised and unsupervised modes. The proposed on-line adaptation procedure employing KSVD clustering is found to yield a relative improvement of 6% in word error rate (WER) on a large-vocabulary continuous speech recognition (LVCSR) task.
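
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes each training speaker is summarised by a fixed-length vector (for example a mean supervector), runs K-SVD with a sparsity of one so that the selected atom doubles as a cluster label, takes the per-cluster means as bases, and fits the interpolation weights by ordinary least squares instead of the likelihood-based estimation a real recogniser would use. The names `ksvd_cluster`, `cluster_bases` and `interpolate` are hypothetical.

```python
import numpy as np


def ksvd_cluster(X, n_clusters, n_iter=10, init_idx=None, seed=0):
    """K-SVD with one atom per signal; X has shape (dim, n_speakers).

    Returns the unit-norm dictionary D (dim, n_clusters) and, for each
    speaker, the index of the atom that codes it (used as a cluster label).
    Assumes every column of X is non-zero.
    """
    if init_idx is None:
        rng = np.random.default_rng(seed)
        init_idx = rng.choice(X.shape[1], size=n_clusters, replace=False)
    # Initialise the atoms with normalised speaker vectors.
    D = X[:, init_idx].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        # Sparse coding at sparsity 1: pick the atom with largest |correlation|.
        labels = np.argmax(np.abs(D.T @ X), axis=0)
        for k in range(n_clusters):
            members = X[:, labels == k]
            if members.shape[1] == 0:
                continue  # unused atom: leave it unchanged
            # Atom update: leading left singular vector of the signals using atom k.
            U, _, _ = np.linalg.svd(members, full_matrices=False)
            D[:, k] = U[:, 0]
    labels = np.argmax(np.abs(D.T @ X), axis=0)
    return D, labels


def cluster_bases(X, labels):
    """Stack the mean vector of every non-empty cluster as a column."""
    return np.stack(
        [X[:, labels == k].mean(axis=1) for k in np.unique(labels)], axis=1
    )


def interpolate(bases, target):
    """Least-squares interpolation weights and the resulting adapted vector."""
    weights, *_ = np.linalg.lstsq(bases, target, rcond=None)
    return weights, bases @ weights
```

In this sketch the on-line step only solves for one weight per cluster, which is why a small number of cluster bases keeps the adaptation cost low compared with using one basis per training speaker.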