IEEE Trans. Speech Audio Process. — Latest Publications

Blind single channel deconvolution using nonstationary signal processing
IEEE Trans. Speech Audio Process. Pub Date: 2003-08-26  DOI: 10.1109/TSA.2003.815522
J. Hopgood, P. Rayner
Abstract: Blind deconvolution is fundamental in signal processing applications and, in particular, the single channel case remains a challenging and formidable problem. This paper considers single channel blind deconvolution in the case where the degraded observed signal may be modeled as the convolution of a nonstationary source signal with a stationary distortion operator. The important feature that the source is nonstationary while the channel is stationary facilitates the unambiguous identification of either the source or channel, and deconvolution is possible, whereas if the source and channel are both stationary, identification is ambiguous. The parameters of the channel are estimated by modeling the source as a time-varying AR process and the distortion by an all-pole filter, and using the Bayesian framework for parameter estimation. This estimate can then be used to deconvolve the observed signal. In contrast to the classical histogram approach for estimating the channel poles, which merely relies on the fact that the channel is actually stationary rather than modeling it as such, the proposed Bayesian method takes the channel's stationarity into account in the model and, consequently, is more robust. The properties of this model are investigated, and the advantage of utilizing the nonstationarity of a system rather than considering it as a curse is discussed.
Pages: 476-488
Citations: 58
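A minimal sketch, in Python with illustrative parameter values, of the observation model this abstract describes: a time-varying AR source filtered by a stationary all-pole channel, and the way a channel estimate would then be used to deconvolve the observation. The Bayesian parameter estimation itself is not reproduced; `c_hat` simply stands in for the estimated channel.

```python
# Minimal sketch of the observation model: a nonstationary (time-varying AR)
# source convolved with a stationary all-pole channel. Parameter values and
# the block structure are illustrative only.
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
n_blocks, block_len = 8, 512          # piecewise-stationary approximation of the TVAR source

# Source: AR(2) whose pole radius/angle drift from block to block (nonstationarity).
source = []
for b in range(n_blocks):
    radius = 0.80 + 0.15 * b / (n_blocks - 1)
    angle = np.pi * (0.15 + 0.05 * b)
    a_src = [1.0, -2 * radius * np.cos(angle), radius ** 2]   # AR polynomial A_b(z)
    source.append(lfilter([1.0], a_src, rng.standard_normal(block_len)))
source = np.concatenate(source)

# Channel: fixed all-pole distortion 1 / C(z); this is what the Bayesian scheme
# would estimate (here we just apply it to build the observed signal).
c = [1.0, -1.2, 0.7]                   # hypothetical stationary channel denominator C(z)
observed = lfilter([1.0], c, source)

# Given an estimate c_hat of C(z), deconvolution is simply x_hat = C(z) y:
c_hat = c                              # stand-in for the Bayesian channel estimate
source_hat = lfilter(c_hat, [1.0], observed)
print("reconstruction error:", np.max(np.abs(source_hat - source)))
```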
A new approach to utterance verification based on neighborhood information in model space
IEEE Trans. Speech Audio Process. Pub Date: 2003-08-26  DOI: 10.1109/TSA.2003.815821
Hui Jiang, Chin-Hui Lee
Abstract: We propose to use neighborhood information in model space to perform utterance verification (UV). First, we present a nested-neighborhood structure for each underlying model in model space and assume the underlying model's competing models sit in one of these neighborhoods, which is used to model the alternative hypothesis in UV. Bayes factors (BF) are then introduced to UV and used as the major tool to calculate confidence measures based on this idea. Experimental results in the Bell Labs communicator system show that the new method dramatically improves verification performance when verifying correct words against mis-recognized words in the recognizer's output, with a relative reduction of more than 20% in equal error rate (EER) compared with the standard approach based on likelihood ratio testing and anti-models.
Pages: 425-434
Citations: 29
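A toy illustration of the kind of confidence score the abstract describes: the likelihood of a hypothesized model is compared against models in a "neighborhood" around it, which stands in for the alternative hypothesis. The Gaussian observation models, the neighborhood construction, and the acceptance threshold are all assumptions made for illustration, not the paper's exact recipe.

```python
# Neighborhood-based confidence measure, sketched with Gaussian models:
# score = log-likelihood of target model minus log of the average likelihood
# over its neighborhood (a crude "Bayes factor"-style statistic).
import numpy as np
from scipy.stats import multivariate_normal

def log_lik(frames, mean, cov):
    return multivariate_normal(mean=mean, cov=cov).logpdf(frames).sum()

rng = np.random.default_rng(1)
dim = 2
target_mean = np.zeros(dim)
# Neighborhood: perturbed copies of the target model act as the competing models.
neighborhood = [target_mean + rng.normal(scale=1.5, size=dim) for _ in range(5)]

frames = rng.normal(loc=target_mean, scale=1.0, size=(40, dim))   # observed feature frames

ll_target = log_lik(frames, target_mean, np.eye(dim))
ll_alt = np.logaddexp.reduce([log_lik(frames, m, np.eye(dim)) for m in neighborhood]) \
         - np.log(len(neighborhood))                               # average alternative likelihood
confidence = ll_target - ll_alt
print("accept" if confidence > 0.0 else "reject", confidence)
```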
A perceptually motivated approach for speech enhancement
IEEE Trans. Speech Audio Process. Pub Date: 2003-08-26  DOI: 10.1109/TSA.2003.815936
Y. Hu, P. Loizou
Abstract: A new perceptually motivated approach is proposed for enhancement of speech corrupted by colored noise. The proposed approach takes into account the frequency masking properties of the human auditory system and reduces the perceptual effect of the residual noise. This new perceptual method is incorporated into a frequency-domain speech enhancement method and a subspace-based speech enhancement method. A better power spectrum/autocorrelation function estimator was also developed to improve the performance of the proposed algorithms. Objective measures and informal listening tests demonstrated significant improvements over other methods when tested with TIMIT sentences corrupted by various types of colored noise.
Pages: 457-465
Citations: 99
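A simplified sketch of the general idea of perceptually weighted noise suppression: the spectral gain is floored so that residual noise is only pushed down to a masking-threshold-like level rather than suppressed maximally. The masking threshold used below is a crude smoothed spectrum, not a psychoacoustic model, and the whole routine is an illustrative stand-in for the frequency-domain variant described in the abstract.

```python
# Perceptually weighted spectral gain (sketch): a Wiener-like gain is lower-
# bounded so residual noise sits just below a stand-in masking threshold,
# avoiding over-suppression where the noise would be inaudible anyway.
import numpy as np
from scipy.signal import stft, istft
from scipy.ndimage import uniform_filter1d

def enhance(noisy, noise_psd, fs=8000):
    f, t, Y = stft(noisy, fs=fs, nperseg=256)
    power = np.abs(Y) ** 2
    mask_thr = 0.1 * uniform_filter1d(power, size=9, axis=0)      # crude masking threshold
    gain = np.maximum(1.0 - noise_psd[:, None] / np.maximum(power, 1e-12), 0.0)
    floor = np.sqrt(np.minimum(mask_thr / np.maximum(noise_psd[:, None], 1e-12), 1.0))
    gain = np.maximum(gain, floor)                                 # perceptual gain floor
    _, x_hat = istft(gain * Y, fs=fs, nperseg=256)
    return x_hat

fs = 8000
rng = np.random.default_rng(2)
clean = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
noise = 0.3 * rng.standard_normal(fs)
noise_psd = (np.abs(stft(noise, fs=fs, nperseg=256)[2]) ** 2).mean(axis=1)
print(enhance(clean + noise, noise_psd, fs).shape)
```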
Audio source separation of convolutive mixtures
IEEE Trans. Speech Audio Process. Pub Date: 2003-08-26  DOI: 10.1109/TSA.2003.815820
N. Mitianoudis, M. Davies
Abstract: The problem of separating audio sources recorded in a real-world situation is well established in modern literature. A method to solve this problem is blind source separation (BSS) using independent component analysis (ICA). The recording environment is usually modeled as convolutive. Previous research on ICA of instantaneous mixtures provided a solid background for the separation of convolved mixtures. The authors review current approaches to the subject and propose a fast frequency-domain ICA framework, providing a solution for the apparent permutation problem encountered in these methods.
Pages: 489-497
Citations: 153
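The permutation problem mentioned in the abstract arises because each STFT bin is separated independently, so the separated outputs in neighboring bins may be arbitrarily swapped. Below is a small sketch of one common alignment strategy, correlating amplitude envelopes across bins; the per-bin ICA stage is assumed already done and the envelopes are synthetic, so this is an illustration of the problem, not the paper's specific framework.

```python
# Permutation alignment for frequency-domain ICA outputs (sketch): adjacent
# bins are aligned by maximizing the correlation of their amplitude envelopes
# with a running reference envelope.
import numpy as np
from itertools import permutations

def align_permutations(envelopes):
    """envelopes: array (n_bins, n_srcs, n_frames) of |separated STFT| per bin."""
    n_bins, n_srcs, _ = envelopes.shape
    aligned = envelopes.copy()
    ref = aligned[0]                                  # running reference envelope
    for k in range(1, n_bins):
        best, best_score = None, -np.inf
        for perm in permutations(range(n_srcs)):
            score = sum(np.corrcoef(ref[i], aligned[k, p])[0, 1]
                        for i, p in enumerate(perm))
            if score > best_score:
                best, best_score = perm, score
        aligned[k] = aligned[k, list(best)]
        ref = 0.8 * ref + 0.2 * aligned[k]            # smooth reference over frequency
    return aligned

rng = np.random.default_rng(3)
base = np.abs(rng.standard_normal((2, 200)))          # two source activity patterns
env = np.stack([base + 0.1 * rng.standard_normal((2, 200)) for _ in range(64)])
env[::2] = env[::2, ::-1]                             # artificially swap every other bin
print(align_permutations(env).shape)                  # (64, 2, 200), swaps resolved
```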
Fast model selection based speaker adaptation for nonnative speech
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814379
Xiaodong He, Yunxin Zhao
Abstract: The problem of adapting acoustic models of native English speech to nonnative speakers is addressed from a perspective of adaptive model complexity selection. The goal is to select model complexity dynamically for each nonnative talker so as to optimize the balance between model robustness to pronunciation variations and model detailedness for discrimination of speech sounds. A maximum expected likelihood (MEL) based technique is proposed to enable reliable complexity selection when adaptation data are sparse, where the expectation of the log-likelihood (EL) of adaptation data is computed based on distributions of mismatch biases between model and data, and model complexity is selected to maximize EL. The MEL based complexity selection is further combined with MLLR (maximum likelihood linear regression) to enable adaptation of both complexity and parameters of acoustic models. Experiments were performed on WSJ1 data of speakers with a wide range of foreign accents. Results show that the MEL based complexity selection is feasible when using as little as one adaptation utterance, and it is able to select dynamically the proper model complexity as the adaptation data increases. Compared with the standard MLLR, the MEL+MLLR method leads to consistent and significant improvement to recognition accuracy on nonnative speakers, without performance degradation on native speakers.
Pages: 298-307
Citations: 25
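A toy stand-in for the complexity-selection idea: given a small amount of adaptation data, score pre-trained models of increasing complexity (here, one-dimensional Gaussian mixtures with more components) and keep the complexity with the highest per-frame log-likelihood. The MEL criterion in the paper additionally takes the expectation over a distribution of mismatch biases; that refinement, and all model parameters below, are omitted or invented for illustration.

```python
# Toy model-complexity selection by likelihood of sparse adaptation data.
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def gmm_loglik(x, weights, means, stds):
    comp = np.log(weights) + norm.logpdf(x[:, None], means, stds)   # (n, k)
    return logsumexp(comp, axis=1).mean()

# Pre-trained "native" models of growing complexity (illustrative parameters).
models = {
    1: (np.array([1.0]), np.array([0.0]), np.array([2.0])),
    2: (np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])),
    4: (np.array([0.25] * 4), np.array([-1.5, -0.5, 0.5, 1.5]), np.array([0.6] * 4)),
}

rng = np.random.default_rng(4)
adaptation_data = rng.normal(loc=0.8, scale=1.2, size=30)     # sparse "nonnative" data

scores = {k: gmm_loglik(adaptation_data, *m) for k, m in models.items()}
best = max(scores, key=scores.get)
print("selected complexity:", best, scores)
```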
A new duration modeling approach for Mandarin speech
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814377
Sin-Horng Chen, Wen-Hsing Lai, Yih-Ru Wang
Abstract: A new duration modeling approach for Mandarin speech is proposed. It explicitly models several major affecting factors by multiplicative companding factors (CFs) and estimates all model parameters with an EM algorithm. The three basic Tone 3 patterns (i.e., full tone, half tone and sandhi tone) are also properly considered, using three different CFs to separate how they affect syllable duration. Experimental results show that the variance of the syllable duration is greatly reduced from 180.17 to 2.52 frames² (1 frame = 5 ms) when the model is used to eliminate the effects of those affecting factors. Moreover, the estimated CFs of those affecting factors agree well with prior linguistic knowledge. Two extensions of the duration modeling method are also performed. One is the use of the same technique to model initial and final durations. The other is to replace the multiplicative model with an additive one. Lastly, a preliminary study of applying the proposed model to predict syllable duration for TTS (text-to-speech) is also performed. Experimental results show that it outperforms the conventional regressive prediction method.
Pages: 308-320
Citations: 36
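A simplified fit of a multiplicative companding-factor model of the form the abstract describes: syllable duration equals a base duration scaled by a CF per categorical factor (here only tone). Taking logs turns the product into a linear model that ordinary least squares can fit; the paper estimates the CFs jointly with an EM algorithm instead, and the data and factor set below are synthetic.

```python
# Multiplicative duration model fitted in the log domain (simplified sketch).
import numpy as np

rng = np.random.default_rng(5)
tones = rng.integers(0, 5, size=500)                  # factor: tone category 0..4
true_cf = np.array([1.00, 0.90, 1.10, 1.25, 0.80])    # true companding factors
base = 30.0                                           # base duration in frames
durations = base * true_cf[tones] * np.exp(0.05 * rng.standard_normal(500))

# Design matrix: intercept (log base) + one-hot tone columns (log CFs),
# with tone 0 as the reference so the system is identifiable.
X = np.column_stack([np.ones(500)] + [(tones == t).astype(float) for t in range(1, 5)])
coef, *_ = np.linalg.lstsq(X, np.log(durations), rcond=None)

est_base = np.exp(coef[0])
est_cf = np.concatenate([[1.0], np.exp(coef[1:])])
print("base:", round(est_base, 2), "CFs:", np.round(est_cf, 3))
```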
High-fidelity multichannel audio coding with Karhunen-Loeve transform
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814375
Dai Yang, H. Ai, C. Kyriakakis, C.-C. Jay Kuo
Abstract: A new quality-scalable high-fidelity multichannel audio compression algorithm based on MPEG-2 advanced audio coding (AAC) is presented. The Karhunen-Loeve transform (KLT) is applied to multichannel audio signals in the preprocessing stage to remove interchannel redundancy. Then, signals in decorrelated channels are compressed by a modified AAC main profile encoder. Finally, a channel transmission control mechanism is used to re-organize the bitstream so that the multichannel audio bitstream has a quality scalable property when it is transmitted over a heterogeneous network. Experimental results show that, compared with AAC, the proposed algorithm achieves a better performance while maintaining a similar computational complexity at the regular bit rate of 64 kbit/sec/ch. When the bitstream is transmitted to narrowband end users at a lower bit rate, packets in some channels can be dropped, and slightly degraded, yet full-channel, audio can still be reconstructed in a reasonable fashion without any additional computational cost.
Pages: 365-380
Citations: 34
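A minimal sketch of the KLT preprocessing stage: the interchannel covariance is eigendecomposed, the channels are projected onto the eigenvectors (decorrelated), and the inverse transform reconstructs the original channels. Zeroing the weakest KLT channels mimics the quality-scalable dropping of packets described in the abstract; the AAC coding stage itself is not shown, and the signals are synthetic.

```python
# Interchannel KLT: decorrelate channels, optionally drop weak ones, invert.
import numpy as np

rng = np.random.default_rng(6)
n_ch, n_samp = 5, 4096
common = rng.standard_normal(n_samp)
audio = np.stack([common + 0.2 * rng.standard_normal(n_samp) for _ in range(n_ch)])

cov = np.cov(audio)                            # (n_ch, n_ch) interchannel covariance
eigval, eigvec = np.linalg.eigh(cov)           # ascending eigenvalues
order = np.argsort(eigval)[::-1]
klt = eigvec[:, order].T                       # KLT matrix, rows = principal directions

decorrelated = klt @ audio                     # these channels would be AAC-coded
decorrelated[-2:] = 0.0                        # simulate dropping the two weakest channels
reconstructed = klt.T @ decorrelated           # inverse transform (klt is orthogonal)
print("SNR (dB):", 10 * np.log10(np.sum(audio**2) / np.sum((audio - reconstructed)**2)))
```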
Perceptual phase quantization of speech
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814409
Doh-Suk Kim
Abstract: It is essential to incorporate perceptual characteristics of human hearing in modern speech/audio coding systems. However, the focus has been confined only to the magnitude information of speech, and little attention has been paid to phase information. A quantitative study on the characteristics of human phase perception is presented and a novel method is proposed for the quantization of phase information in speech/audio signals. First, the just-noticeable difference (JND) of phase for each harmonic in flat-spectrum periodic tones is measured for several different fundamental frequencies. Then, a mathematical model of JND is established, based on measured data, to form a weighting function for phase quantization. Since the proposed weighting function is derived from psychoacoustic measurements, it provides a novel quantization method by which more bits are assigned to perceptually important phase components at the sacrifice of less important ones, resulting in a quantized signal perceptually closer to the original one. Experimental results on five vowel speech signals demonstrate that the proposed weighting function is very effective for the quantization of phase information.
Pages: 355-364
Citations: 23
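An illustrative sketch of weighted phase quantization: each harmonic's phase is quantized with a bit count proportional to an importance weight, so perceptually sensitive harmonics are quantized more finely. The weighting curve below is made up (a simple decay over harmonic index) purely for illustration; the paper derives its weighting function from measured phase JNDs.

```python
# Per-harmonic phase quantization with perceptually weighted bit allocation.
import numpy as np

def quantize_phase(phases, bits):
    """Uniform quantization of phases in [-pi, pi) with per-harmonic bit counts."""
    levels = 2.0 ** bits
    step = 2 * np.pi / levels
    return np.round((phases + np.pi) / step) * step - np.pi

rng = np.random.default_rng(7)
n_harm = 20
phases = rng.uniform(-np.pi, np.pi, size=n_harm)

weight = 1.0 / (1.0 + np.arange(n_harm))            # hypothetical importance weighting
total_bits = 60
bits = np.maximum(np.round(total_bits * weight / weight.sum()), 1).astype(int)

q = quantize_phase(phases, bits)
print("bits per harmonic:", bits)
print("max phase error (rad), low vs high harmonics:",
      np.abs(q - phases)[:5].max(), np.abs(q - phases)[-5:].max())
```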
A generalized subspace approach for enhancing speech corrupted by colored noise
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814458
Y. Hu, P. Loizou
Abstract: A generalized subspace approach is proposed for enhancement of speech corrupted by colored noise. A nonunitary transform, based on the simultaneous diagonalization of the clean speech and noise covariance matrices, is used to project the noisy signal onto a signal-plus-noise subspace and a noise subspace. The clean signal is estimated by nulling the signal components in the noise subspace and retaining the components in the signal subspace. The applied transform has built-in prewhitening and can therefore be used in general for colored noise. The proposed approach is shown to be a generalization of the approach proposed by Y. Ephraim and H.L. Van Trees (see ibid., vol.3, p.251-66, 1995) for white noise. Two estimators are derived based on the nonunitary transform, one based on time-domain constraints and one based on spectral domain constraints. Objective and subjective measures demonstrate improvements over other subspace-based methods when tested with TIMIT sentences corrupted with speech-shaped noise and multi-talker babble.
Pages: 334-341
Citations: 406
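A sketch of the nonunitary transform at the core of this approach: `scipy.linalg.eigh` solves the generalized eigenproblem R_clean v = λ R_noise v (the simultaneous diagonalization), and a gain of the form λ/(λ+μ) is applied in that eigenbasis. The covariance matrices, the Lagrange multiplier μ, and the frame-wise application below are illustrative assumptions built on synthetic AR "speech" and colored noise, not the paper's tuned estimator.

```python
# Time-domain-constrained subspace estimator via simultaneous diagonalization.
import numpy as np
from scipy.linalg import eigh, toeplitz
from scipy.signal import lfilter

def autocov(x, order):
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) - 1 + order] / len(x)
    return toeplitz(r)

rng = np.random.default_rng(8)
n, order, mu = 8000, 16, 1.0
clean = lfilter([1.0], [1.0, -1.5, 0.7], rng.standard_normal(n))      # AR(2) "speech"
noise = lfilter([1.0, 0.8], [1.0], rng.standard_normal(n))            # colored noise

R_clean, R_noise = autocov(clean, order), autocov(noise, order)
lam, V = eigh(R_clean, R_noise)                    # simultaneous diagonalization
gain = np.maximum(lam, 0.0) / (np.maximum(lam, 0.0) + mu)
H = np.linalg.solve(V.T, np.diag(gain) @ V.T)      # H = V^{-T} G V^{T}

noisy = clean + noise
frames = noisy[: (n // order) * order].reshape(-1, order)
enhanced = (frames @ H.T).ravel()                  # apply estimator frame by frame
print("error energy before/after:",
      round(np.var(noisy - clean), 3),
      round(np.var(enhanced - clean[:len(enhanced)]), 3))
```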
Joint filterbanks for echo cancellation and audio coding
IEEE Trans. Speech Audio Process. Pub Date: 2003-07-28  DOI: 10.1109/TSA.2003.814798
P. Eneroth
Abstract: Joint structures for audio coding and echo cancellation are investigated, utilizing standard audio coders. Two types of audio coders are considered: coders based on cosine modulated filterbanks and coders based on the modified discrete cosine transform (MDCT). For the first coder type, two methods for combining such a coder with a subband echo canceller are proposed: a modified audio coder filterbank that is suitable for echo cancellation but still generates the same final decomposition as the standard audio coder filterbank, and another that converts subband signals between an audio coder filterbank and a filterbank designed for echo cancellation. For the MDCT based audio coder, a joint structure with a frequency-domain adaptive filter based echo canceller is considered. Computational complexity and transmission delay for the different coder/echo canceller combinations are presented. Convergence properties of the proposed echo canceller structures are shown using simulations with real-life recorded speech.
Pages: 342-354
Citations: 11
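A sketch of the kind of subband/frequency-domain adaptive echo canceller that such joint structures combine with the coder's filterbank: far-end and microphone signals are split into STFT bins and a short complex NLMS filter per bin tracks the echo path. The STFT stands in for the coder filterbank, and the echo path, step size, and tap count are illustrative; the filterbank-sharing and conversion structures from the paper are not shown.

```python
# Per-bin complex NLMS echo canceller on STFT subband signals (sketch).
import numpy as np
from scipy.signal import stft, lfilter

rng = np.random.default_rng(9)
fs, n = 8000, 4 * 8000
far_end = rng.standard_normal(n)
echo_path = rng.standard_normal(256) * np.exp(-np.arange(256) / 50.0)
mic = lfilter(echo_path, [1.0], far_end)               # pure echo, no near-end talk

_, _, X = stft(far_end, fs=fs, nperseg=256)             # (bins, frames)
_, _, D = stft(mic, fs=fs, nperseg=256)

taps, mu, eps = 4, 0.5, 1e-6
n_bins, n_frames = X.shape
W = np.zeros((n_bins, taps), dtype=complex)             # per-bin adaptive filters
E = D.copy()                                            # error (unprocessed until adapted)

for t in range(taps, n_frames):
    Xbuf = X[:, t - taps + 1:t + 1][:, ::-1]            # most recent frame first
    Y = np.sum(W.conj() * Xbuf, axis=1)                 # echo estimate per bin
    E[:, t] = D[:, t] - Y
    norm = np.sum(np.abs(Xbuf) ** 2, axis=1) + eps
    W += mu * (E[:, t].conj() / norm)[:, None] * Xbuf   # NLMS update

erle = 10 * np.log10(np.mean(np.abs(D) ** 2) / np.mean(np.abs(E) ** 2))
print("echo reduction (dB):", round(erle, 1))
```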