{"title":"Reverberation and Noise Robust Feature Compensation Based on IMM","authors":"C. Han, S. Kang, N. Kim","doi":"10.1109/TASL.2013.2256893","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel feature compensation approach based on the interacting multiple model (IMM) algorithm specially designed for joint processing of background noise and acoustic reverberation. Our approach to cope with the time-varying environmental parameters is to establish a switching linear dynamic model for the additive and convolutive distortions, such as the background noise and acoustic reverberation, in the log-spectral domain. We construct multiple state space models with the speech corruption process in which the log spectra of clean speech and log frequency response of acoustic reverberation are jointly handled as the state of our interest. The proposed approach shows significant improvements in the Aurora-5 automatic speech recognition (ASR) task which was developed to investigate the influence on the performance of ASR for a hands-free speech input in noisy room environments.","PeriodicalId":55014,"journal":{"name":"IEEE Transactions on Audio Speech and Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TASL.2013.2256893","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Audio Speech and Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASL.2013.2256893","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11
Abstract
In this paper, we propose a novel feature compensation approach based on the interacting multiple model (IMM) algorithm specially designed for joint processing of background noise and acoustic reverberation. Our approach to cope with the time-varying environmental parameters is to establish a switching linear dynamic model for the additive and convolutive distortions, such as the background noise and acoustic reverberation, in the log-spectral domain. We construct multiple state space models with the speech corruption process in which the log spectra of clean speech and log frequency response of acoustic reverberation are jointly handled as the state of our interest. The proposed approach shows significant improvements in the Aurora-5 automatic speech recognition (ASR) task which was developed to investigate the influence on the performance of ASR for a hands-free speech input in noisy room environments.
期刊介绍:
The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.