A non-linear model transformation for ML stochastic matching in additive noise
Authors: S. Wong, Bertram E. Shi
Published in: 1998 IEEE Second Workshop on Multimedia Signal Processing (Cat. No.98EX175), 1998-12-07
DOI: 10.1109/MMSP.1998.738926 (https://doi.org/10.1109/MMSP.1998.738926)
Citation count: 4
Abstract
We present a non-linear model transformation that adapts Gaussian mixture HMMs, which use both static and dynamic MFCC observation vectors, to the presence of additive noise. The transformation depends on a few compensation coefficients, which can be estimated from a short training token of noise. Alternatively, maximum-likelihood stochastic matching can be applied to estimate the compensation coefficients directly from speech embedded in noise. This can eliminate the need to segment pure noise from speech for the estimation, and it can also compensate for inaccuracies in the estimated compensation coefficients as well as for errors due to the approximations used in deriving the transformation.
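The idea of estimating a compensation coefficient directly from noisy data by maximum likelihood can be illustrated with a toy sketch. This is not the transformation derived in the paper. It assumes a one-dimensional log-spectral feature, a two-component Gaussian mixture standing in for the clean-speech model, and the standard log-add approximation, in which each clean mean `m` maps to `log(exp(m) + exp(n))` for a single noise-level coefficient `n`. All names and numeric values below (`log_add`, `estimate_noise_level`, the mixture parameters, the search grid) are hypothetical and chosen only for illustration. The coefficient is found by a plain grid search over the likelihood of the noisy observations, so no noise-only segment is needed.

```python
import math
import random


def log_add(x, n):
    """Return log(exp(x) + exp(n)), computed in a numerically stable way."""
    hi, lo = max(x, n), min(x, n)
    return hi + math.log1p(math.exp(lo - hi))


def gmm_log_likelihood(obs, weights, means, sigma):
    """Total log-likelihood of scalar observations under a 1-D GMM with shared sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for y in obs:
        p = sum(w * norm * math.exp(-0.5 * ((y - m) / sigma) ** 2)
                for w, m in zip(weights, means))
        total += math.log(max(p, 1e-300))  # floor guards against log(0)
    return total


def estimate_noise_level(obs, weights, clean_means, sigma, grid):
    """Pick the noise level on the grid that maximizes the likelihood of the
    noisy observations under the log-add compensated model (variances untouched)."""
    best_n, best_ll = None, -math.inf
    for n in grid:
        noisy_means = [log_add(m, n) for m in clean_means]
        ll = gmm_log_likelihood(obs, weights, noisy_means, sigma)
        if ll > best_ll:
            best_n, best_ll = n, ll
    return best_n


# Hypothetical clean-speech model and noise level (illustrative values only).
WEIGHTS = [0.5, 0.5]
CLEAN_MEANS = [1.0, 4.0]
SIGMA = 0.5
TRUE_NOISE = 2.0

# Simulate "speech embedded in noise": clean samples corrupted by a constant noise level.
rng = random.Random(0)
observations = []
for _ in range(1000):
    mean = CLEAN_MEANS[0] if rng.random() < WEIGHTS[0] else CLEAN_MEANS[1]
    clean = rng.gauss(mean, SIGMA)
    observations.append(log_add(clean, TRUE_NOISE))

grid = [i * 0.02 for i in range(201)]  # candidate noise levels from 0.00 to 4.00
estimated = estimate_noise_level(observations, WEIGHTS, CLEAN_MEANS, SIGMA, grid)
print(f"true noise level {TRUE_NOISE:.2f}, estimated {estimated:.2f}")
```

Because the log-add rule adjusts only the means, the estimate carries a small bias relative to the true noise level. A practical system would work with full MFCC vectors and HMM state alignments, and would more likely use an EM-style update than a grid search.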