{"title":"Joint Uncertainty Decoding for Noise Robust Subspace Gaussian Mixture Models","authors":"Liang Lu, K. K. Chin, Arnab Ghoshal, S. Renals","doi":"10.1109/TASL.2013.2248718","DOIUrl":null,"url":null,"abstract":"Joint uncertainty decoding (JUD) is a model-based noise compensation technique for conventional Gaussian Mixture Model (GMM) based speech recognition systems. Unlike vector Taylor series (VTS) compensation which operates on the individual Gaussian components in an acoustic model, JUD clusters the Gaussian components into a smaller number of classes, sharing the compensation parameters for the set of Gaussians in a given class. This significantly reduces the computational cost. In this paper, we investigate noise compensation for subspace Gaussian mixture model (SGMM) based speech recognition systems using JUD. The total number of Gaussian components in an SGMM is typically very large. Therefore direct compensation of the individual Gaussian components, as performed by VTS, is computationally expensive. In this paper we show that JUD-based noise compensation can be successfully applied to SGMMs in a computationally efficient way. We evaluate the JUD/SGMM technique on the standard Aurora 4 corpus. Our experimental results indicate that the JUD/SGMM system results in lower word error rates compared with a conventional GMM system with either VTS-based or JUD-based noise compensation.","PeriodicalId":55014,"journal":{"name":"IEEE Transactions on Audio Speech and Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TASL.2013.2248718","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Audio Speech and Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASL.2013.2248718","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
Joint uncertainty decoding (JUD) is a model-based noise compensation technique for conventional Gaussian Mixture Model (GMM) based speech recognition systems. Unlike vector Taylor series (VTS) compensation which operates on the individual Gaussian components in an acoustic model, JUD clusters the Gaussian components into a smaller number of classes, sharing the compensation parameters for the set of Gaussians in a given class. This significantly reduces the computational cost. In this paper, we investigate noise compensation for subspace Gaussian mixture model (SGMM) based speech recognition systems using JUD. The total number of Gaussian components in an SGMM is typically very large. Therefore direct compensation of the individual Gaussian components, as performed by VTS, is computationally expensive. In this paper we show that JUD-based noise compensation can be successfully applied to SGMMs in a computationally efficient way. We evaluate the JUD/SGMM technique on the standard Aurora 4 corpus. Our experimental results indicate that the JUD/SGMM system results in lower word error rates compared with a conventional GMM system with either VTS-based or JUD-based noise compensation.
期刊介绍:
The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.