2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers, Page 7

Channel and sensing aware channel access policy for multi-channel cognitive radio networks

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288581

Shu-Hsien Wang, Chih-yu Hsu, Y. Hong

Citations: 2

Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288876

Xiong Xiao, Chng Eng Siong, Haizhou Li

Abstract: In this paper, we propose a framework for joint normalization of spectral and temporal statistics of speech features for robust speech recognition. Current feature normalization approaches normalize the spectral and temporal aspects of feature statistics separately to overcome noise and reverberation. As a result, the interaction between the spectral normalization (e.g. mean and variance normalization, MVN) and temporal normalization (e.g. temporal structure normalization, TSN) is ignored. We propose a joint spectral and temporal normalization (JSTN) framework to simultaneously normalize these two aspects of feature statistics. In JSTN, feature trajectories are filtered by linear filters and the filters' coefficients are optimized by maximizing a likelihood-based objective function. Experimental results on the Aurora-5 benchmark task show that JSTN consistently outperforms the cascade of MVN and TSN on test data corrupted by both additive noise and reverberation, which validates our proposal. Specifically, JSTN reduces the average word error rate by a relative 8-9% over the cascade of MVN and TSN for both artificial and real noisy data.

Pages: 4325-4328

Citations: 10

Bias analysis of source localization using the maximum likelihood estimator

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288450

Liyang Rui, K. C. Ho

Citations: 24

Expected-utility-based sensor selection for state estimation

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288470

David M. Cohen, Douglas L. Jones, S. Narayanan

Citations: 2

Model centroids for the simplification of Kernel Density estimators

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6287989

Olivier Schwander, F. Nielsen

Citations: 12

User recommendation with tensor factorization in social networks

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288758

Zhenlei Yan, Jie Zhou

Citations: 10

Adaptive kernel principal components tracking

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288276

Toshihisa Tanaka, Y. Washizawa, A. Kuh

Citations: 8

Lagrangian multiplier optimization using correlations in residues

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288099

Zhenyu Liu, Dongsheng Wang, Junwei Zhou, T. Ikenaga

Abstract: Rate distortion optimization (RDO) plays a vital role in the up-to-date hybrid video codec H.264/AVC. The RDO algorithm of the H.264/AVC reference software is built on the assumption that the transformed residues are memoryless variables. However, our experiments reveal that, for some sequences, strong temporal correlations exist in the prediction residues. This paper extends the Lagrangian optimization techniques by modeling the transformed residues as a first-order Markov source and calibrating the distortion model with a piecewise approximation function. The proposed algorithms adjust the Lagrangian multiplier dynamically to improve the overall coding quality. Comprehensive experiments show that, compared with the JM reference software, our optimizations can achieve up to 1.875 dB coding gain. Moreover, our algorithms possess more robust coding performance and introduce less computational overhead than the Laplace distribution based methods. The inherently short processing latency makes it possible to combine our algorithms with rate control. Last but not least, the proposed approach is also useful for the emerging standard, HEVC.

Pages: 1185-1188

Citations: 5

GMM foreground segmentation processor based on address free pixel streams

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288213

R. Yagi, Tomohito Kajimoto, T. Nishitani

Citations: 3

A local intensity adaptive structural similarity index

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288090

Zhengguo Li, Chuohao Yeo, Y. H. Tan, S. Rahardja

Citations: 0