{"title":"CRB analysis of near-field source localization using uniform circular arrays","authors":"J. Delmas, H. Gazzah","doi":"10.1109/ICASSP.2013.6638409","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638409","url":null,"abstract":"This paper is devoted to the Cramer Rao bound (CRB) on the azimuth, elevation and range of a narrow-band near-field source localized by means of a uniform circular array (UCA), using the exact expression of the time delay parameter. After proving that the conditional and unconditional CRB are generally proportional for constant modulus steering vectors, we specify conditions of isotropy w.r.t. the distance and the number of sensors. Then we derive very simple, yet very accurate non-matrix closed-form expressions of different approximations of the CRBs.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125084108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast block-based algorithms for connected components labeling","authors":"Diego J. C. Santiago, Ing Ren Tsang, George D. C. Cavalcanti, I. Tsang","doi":"10.1109/ICASSP.2013.6638021","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638021","url":null,"abstract":"Block-based algorithms are considered the fastest approach to label connected components in binary images. However, the existing algorithms are two-scan which would need more comparisons if they were used as one-and-a-half-scan algorithms. Here, we proposed a new mask that enables the design of a block-based one-and-a-half-scan algorithm without any extra comparison. Furthermore, three new efficient algorithms for connected components labeling are presented: a block-based two-scan, a pixel-based one-and-a-half-scan and a block-based one-and-a-half-scan. We conducted experiments using synthetic and realistic images to evaluate the performance of the proposed methods compared to the existing methods. The proposed block-based one-and-a-half-scan algorithm presents the best performance in the realistic images dataset composed of 1290 documents. Our block-based two-scan algorithm proved to be the fastest in the synthetic dataset, especially in low density images.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125163995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA","authors":"Andrew Hines, J. Skoglund, A. Kokaram, N. Harte","doi":"10.1109/ICASSP.2013.6638348","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638348","url":null,"abstract":"The Virtual Speech Quality Objective Listener (ViSQOL) is a new objective speech quality model. It is a signal based full reference metric that uses a spectro-temporal measure of similarity between a reference and a test speech signal. ViSQOL aims to predict the overall quality of experience for the end listener whether the cause of speech quality degradation is due to ambient noise, or transmission channel degradations. This paper describes the algorithm and tests the model using two speech corpora: NOIZEUS and E4. The NOIZEUS corpus contains speech under a variety of background noise types, speech enhancement methods, and SNR levels. The E4 corpus contains voice over IP degradations including packet loss, jitter and clock drift. The results are compared with the ITU-T objective models for speech quality: PESQ and POLQA. The behaviour of the metrics are also evaluated under simulated time warp conditions. The results show that for both datasets ViSQOL performed comparably with PESQ. POLQA was shown to have lower correlation with subjective scores than the other metrics for the NOIZEUS database.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125848680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An integral operator based adaptive signal separation approach","authors":"Xiyuan Hu, Silong Peng, W. Hwang","doi":"10.1109/ICASSP.2013.6638837","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638837","url":null,"abstract":"The operator-based signal separation approach uses an adaptive operator to separate a signal into additive subcomponents. And different types of operator can depict different properties of a signal. In this paper, we define a new kind of integral operator which can be derived from the second kind of Fredholm integral equation. Then, we analyze the properties of the proposed integral operator and discuss its relation to the second condition of Intrinsic Mode Function (IMF). To demonstrate the robustness and efficacy of the proposed operator, we incorporate it into the Null Space Pursuit algorithm to separate several multicomponent signals, including a real-life signal.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126086018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced inter-prediction using Merge Prediction Transformation in the HEVC codec","authors":"Saverio G. Blasi, Eduardo Peixoto, E. Izquierdo","doi":"10.1109/ICASSP.2013.6637944","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6637944","url":null,"abstract":"Merge prediction is a novel technique introduced in the HEVC standard to improve inter-prediction exploiting redundancy of themotion information. We propose in this paper a new approach to enhance the Merge mode in a typical HEVC encoder using parametric transformations of the Merge prediction candidates. An Enhanced Inter-Prediction module is implemented in HEVC using Merge Prediction Transformation (MPT), integrated with the HEVC new features such as the large coding units (CU) and the recursive prediction unit partitioning. The MPT parameters are quantised according to the CU depth and the current QP. The optimal quantization steps are derived via statistical analysis as illustrated in the paper. Results show consistent improvements over conventional HEVC encoding in terms of rate-distortion performance, with a small impact on the encoding complexity and negligible impact on the decoding complexity.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125315712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectral envelope estimation used for audio bandwidth extension based on RBF neural network","authors":"Haojie Liu, C. Bao, Xin Liu","doi":"10.1109/ICASSP.2013.6637706","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6637706","url":null,"abstract":"In this paper a new spectral envelope estimation method based on radial basis function (RBF) neural network is proposed for implementing a blind bandwidth extension method of audio signals. To make the sub-band envelope of high-frequency (HF) components accurately recovered, the RBF neural network is utilized to fit the relationship between low-frequency (LF) features and sub-band envelope of HF components. In addition, the fine structure of HF components which can guarantee the timber of the extended audio signal is reconstructed based on nonlinear dynamics. The objective and subjective test results indicate that the proposed method outperforms the reference methods.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125410558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multisource DOA estimation based on time-frequency sparsity and joint inter-sensor data ratio with single acoustic vector sensor","authors":"Y. Zou, Wei Shi, Bo Li, C. Ritz, M. Shujau, J. Xi","doi":"10.1109/ICASSP.2013.6638412","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638412","url":null,"abstract":"By exploring the time-frequency (TF) sparsity property of the speech, the inter-sensor data ratios (ISDRs) of single acoustic vector sensor (AVS) have been derived and investigated. Under noiseless condition, ISDRs have favorable properties, such as being independent of frequency, DOA related with single valuedness, and no constraints on near or far field conditions. With these observations, we further investigated the behavior of ISDRs under noisy conditions and proposed a so-called ISDR-DOA estimation algorithm, where high local SNR data extraction and bivariate kernel density estimation techniques have been adopted to cluster the ISDRs representing the DOA information. Compared with the traditional DOA estimation methods with a small microphone array, the proposed algorithm has the merits of smaller size, no spatial aliasing and less computational cost. Simulation studies show that the proposed method with a single AVS can estimate up to seven sources simultaneously with high accuracy when the SNR is larger than 15dB. In addition, the DOA estimation results based on recorded data further validates the proposed algorithm.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126601797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward body language generation in dyadic interaction settings from interlocutor multimodal cues","authors":"Zhaojun Yang, A. Metallinou, Shrikanth S. Narayanan","doi":"10.1109/ICASSP.2013.6638361","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638361","url":null,"abstract":"During dyadic interactions, participants influence each other's verbal and nonverbal behaviors. In this paper, we examine the coordination between a dyad's body language behavior, such as body motion, posture and relative orientation, given the participants' communication goals, e.g., friendly or conflictive, in improvised interactions. We further describe a Gaussian Mixture Model (GMM) based statistical methodology for automatically generating body language of a listener from speech and gesture cues of a speaker. The experimental results show that automatically generated body language trajectories generally follow the trends of observed trajectories, especially for velocities of body and arms, and that the use of speech information improves prediction performance. These results suggest that there is a significant level of predictability of body language in the examined goal-driven improvisations, which could be exploited for interaction-driven and goal-driven body language generation.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126831196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient algorithm for rational kernel evaluation in large lattice sets","authors":"J. Svec, P. Ircing","doi":"10.1109/ICASSP.2013.6638235","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638235","url":null,"abstract":"This paper presents an effective method for evaluation of the rational kernels represented by finite-state automata. The described algorithm is optimized for processing speed and thus facilitates the usage of state-of-the-art machine learning techniques like Support Vector Machines even in the real-time application of speech and language processing, such as dialogue systems and speech retrieval engines. The performance of the devised algorithm was tested on a spoken language understanding task and the results suggest that it consistently outperforms the baseline algorithm presented in the related literature.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126847150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On passive TDOA and FDOA localization using two sensors with no time or frequency synchronization","authors":"A. Yeredor","doi":"10.1109/ICASSP.2013.6638423","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638423","url":null,"abstract":"Traditional passive localization based on Time-Difference of Arrival (TDOA) or Frequency-Difference of Arrival (FDOA) usually involves several remote sensors, which require precise time-synchronization and frequency-locking among them. The need for such time or frequency alignment sometimes poses a serious operational challenge on the system. In addition, it is often desired to keep the number of sensors to a minimum. In this work we look into the operationally-simplest scenario in this context: using only two sensors, without any synchronization or locking. When at least one of the sensors, or the transmitting target, is moving at some considerable speed, it is still possible to localize the target, based on a few TDOA and / or FDOA measurements, by considering the time- and frequency-offsets as additional unknown parameters. We analyze the associated performance bound and propose a Maximum Likelihood estimation approach. The attainable accuracy and its dependence on geometry are demonstrated numerically and in simulation.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126927515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}