2013 IEEE International Conference on Acoustics, Speech and Signal Processing: Latest Papers

CRB analysis of near-field source localization using uniform circular arrays
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638409
J. Delmas, H. Gazzah
Abstract: This paper is devoted to the Cramér-Rao bound (CRB) on the azimuth, elevation and range of a narrow-band near-field source localized by means of a uniform circular array (UCA), using the exact expression of the time delay parameter. After proving that the conditional and unconditional CRBs are generally proportional for constant-modulus steering vectors, we specify conditions of isotropy w.r.t. the distance and the number of sensors. We then derive very simple, yet very accurate, non-matrix closed-form expressions of different approximations of the CRBs.
Citations: 19
Fast block-based algorithms for connected components labeling
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638021
Diego J. C. Santiago, Ing Ren Tsang, George D. C. Cavalcanti, I. Tsang
Abstract: Block-based algorithms are considered the fastest approach to labeling connected components in binary images. However, the existing algorithms are two-scan algorithms and would need more comparisons if they were used as one-and-a-half-scan algorithms. Here, we propose a new mask that enables the design of a block-based one-and-a-half-scan algorithm without any extra comparisons. Furthermore, three new efficient algorithms for connected components labeling are presented: a block-based two-scan, a pixel-based one-and-a-half-scan and a block-based one-and-a-half-scan algorithm. We conducted experiments using synthetic and realistic images to evaluate the performance of the proposed methods compared to the existing methods. The proposed block-based one-and-a-half-scan algorithm performs best on the realistic image dataset, which is composed of 1290 documents. Our block-based two-scan algorithm proved to be the fastest on the synthetic dataset, especially for low-density images.
Citations: 4
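As background for the entry above, here is a minimal sketch of the classic pixel-based two-scan labeling scheme with union-find, i.e. the kind of baseline that block-based methods are usually compared against. It is not the authors' block-based mask or their one-and-a-half-scan algorithms. The function name `label_components`, the list-of-rows image format and the choice of 8-connectivity are assumptions made for illustration.

```python
def label_components(image):
    """Two-scan, 8-connected labeling of a binary image given as a list of rows.

    Returns (labels, count), where labels has the same shape as image,
    background pixels are 0 and components are numbered 1..count.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First scan: assign provisional labels and record label equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            neighbours = []
            # Already-visited neighbours: upper-left, upper, upper-right, left.
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    neighbours.append(labels[rr][cc])
            if not neighbours:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1
            else:
                smallest = min(neighbours)
                labels[r][c] = smallest
                for n in neighbours:
                    union(smallest, n)

    # Second scan: replace provisional labels by compact final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

In this sketch the second scan touches every pixel again; a one-and-a-half-scan variant, as described in the abstract, would aim to avoid revisiting all of them.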
Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638348
Andrew Hines, J. Skoglund, A. Kokaram, N. Harte
Abstract: The Virtual Speech Quality Objective Listener (ViSQOL) is a new objective speech quality model. It is a signal-based full-reference metric that uses a spectro-temporal measure of similarity between a reference and a test speech signal. ViSQOL aims to predict the overall quality of experience for the end listener, whether the speech quality degradation is caused by ambient noise or by transmission channel degradations. This paper describes the algorithm and tests the model using two speech corpora: NOIZEUS and E4. The NOIZEUS corpus contains speech under a variety of background noise types, speech enhancement methods, and SNR levels. The E4 corpus contains voice over IP degradations including packet loss, jitter and clock drift. The results are compared with the ITU-T objective models for speech quality: PESQ and POLQA. The behaviour of the metrics is also evaluated under simulated time warp conditions. The results show that for both datasets ViSQOL performed comparably with PESQ. POLQA was shown to have lower correlation with subjective scores than the other metrics for the NOIZEUS database.
Citations: 34
An integral operator based adaptive signal separation approach
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638837
Xiyuan Hu, Silong Peng, W. Hwang
Abstract: The operator-based signal separation approach uses an adaptive operator to separate a signal into additive subcomponents. Different types of operators can capture different properties of a signal. In this paper, we define a new kind of integral operator which can be derived from the Fredholm integral equation of the second kind. Then, we analyze the properties of the proposed integral operator and discuss its relation to the second condition of the Intrinsic Mode Function (IMF). To demonstrate the robustness and efficacy of the proposed operator, we incorporate it into the Null Space Pursuit algorithm to separate several multicomponent signals, including a real-life signal.
Citations: 1
Enhanced inter-prediction using Merge Prediction Transformation in the HEVC codec
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6637944
Saverio G. Blasi, Eduardo Peixoto, E. Izquierdo
Abstract: Merge prediction is a novel technique introduced in the HEVC standard to improve inter-prediction by exploiting redundancy of the motion information. In this paper we propose a new approach to enhance the Merge mode in a typical HEVC encoder using parametric transformations of the Merge prediction candidates. An Enhanced Inter-Prediction module is implemented in HEVC using Merge Prediction Transformation (MPT), integrated with new HEVC features such as the large coding units (CU) and the recursive prediction unit partitioning. The MPT parameters are quantized according to the CU depth and the current QP. The optimal quantization steps are derived via statistical analysis, as illustrated in the paper. Results show consistent improvements over conventional HEVC encoding in terms of rate-distortion performance, with a small impact on the encoding complexity and negligible impact on the decoding complexity.
Citations: 8
Spectral envelope estimation used for audio bandwidth extension based on RBF neural network
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6637706
Haojie Liu, C. Bao, Xin Liu
Abstract: In this paper, a new spectral envelope estimation method based on a radial basis function (RBF) neural network is proposed for implementing blind bandwidth extension of audio signals. To recover the sub-band envelope of the high-frequency (HF) components accurately, the RBF neural network is used to fit the relationship between low-frequency (LF) features and the sub-band envelope of the HF components. In addition, the fine structure of the HF components, which preserves the timbre of the extended audio signal, is reconstructed based on nonlinear dynamics. The objective and subjective test results indicate that the proposed method outperforms the reference methods.
Citations: 8
Multisource DOA estimation based on time-frequency sparsity and joint inter-sensor data ratio with single acoustic vector sensor
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638412
Y. Zou, Wei Shi, Bo Li, C. Ritz, M. Shujau, J. Xi
Abstract: By exploiting the time-frequency (TF) sparsity of speech, the inter-sensor data ratios (ISDRs) of a single acoustic vector sensor (AVS) have been derived and investigated. Under noiseless conditions, ISDRs have favorable properties: they are independent of frequency, they are related to the DOA in a single-valued manner, and they impose no constraints on near- or far-field conditions. Based on these observations, we further investigated the behavior of ISDRs under noisy conditions and proposed a so-called ISDR-DOA estimation algorithm, in which high local SNR data extraction and bivariate kernel density estimation techniques are adopted to cluster the ISDRs representing the DOA information. Compared with traditional DOA estimation methods using a small microphone array, the proposed algorithm has the merits of smaller size, no spatial aliasing and lower computational cost. Simulation studies show that the proposed method with a single AVS can estimate up to seven sources simultaneously with high accuracy when the SNR is larger than 15 dB. In addition, the DOA estimation results based on recorded data further validate the proposed algorithm.
Citations: 27
Toward body language generation in dyadic interaction settings from interlocutor multimodal cues
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638361
Zhaojun Yang, A. Metallinou, Shrikanth S. Narayanan
Abstract: During dyadic interactions, participants influence each other's verbal and nonverbal behaviors. In this paper, we examine the coordination between a dyad's body language behavior, such as body motion, posture and relative orientation, given the participants' communication goals, e.g., friendly or conflictive, in improvised interactions. We further describe a Gaussian Mixture Model (GMM) based statistical methodology for automatically generating body language of a listener from speech and gesture cues of a speaker. The experimental results show that automatically generated body language trajectories generally follow the trends of observed trajectories, especially for velocities of body and arms, and that the use of speech information improves prediction performance. These results suggest that there is a significant level of predictability of body language in the examined goal-driven improvisations, which could be exploited for interaction-driven and goal-driven body language generation.
Citations: 5
Efficient algorithm for rational kernel evaluation in large lattice sets
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638235
J. Svec, P. Ircing
Abstract: This paper presents an effective method for evaluating rational kernels represented by finite-state automata. The described algorithm is optimized for processing speed and thus facilitates the use of state-of-the-art machine learning techniques such as Support Vector Machines even in real-time speech and language processing applications, such as dialogue systems and speech retrieval engines. The performance of the devised algorithm was tested on a spoken language understanding task, and the results suggest that it consistently outperforms the baseline algorithm presented in the related literature.
Citations: 6
On passive TDOA and FDOA localization using two sensors with no time or frequency synchronization
2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI: 10.1109/ICASSP.2013.6638423
A. Yeredor
Abstract: Traditional passive localization based on Time-Difference of Arrival (TDOA) or Frequency-Difference of Arrival (FDOA) usually involves several remote sensors, which require precise time-synchronization and frequency-locking among them. The need for such time or frequency alignment sometimes poses a serious operational challenge for the system. In addition, it is often desirable to keep the number of sensors to a minimum. In this work we look into the operationally simplest scenario in this context: using only two sensors, without any synchronization or locking. When at least one of the sensors, or the transmitting target, is moving at some considerable speed, it is still possible to localize the target, based on a few TDOA and/or FDOA measurements, by considering the time and frequency offsets as additional unknown parameters. We analyze the associated performance bound and propose a Maximum Likelihood estimation approach. The attainable accuracy and its dependence on geometry are demonstrated numerically and in simulation.
Citations: 23
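To illustrate the kind of measurement model the abstract above implies, the sketch below computes one noiseless TDOA/FDOA pair for a stationary target and two moving sensors in 2-D, with the unknown clock and frequency offsets entering as additive terms. The function name `tdoa_fdoa`, the 2-D geometry, the stationary-target simplification, the sign convention and the RF propagation speed are all assumptions made for illustration; the paper's actual model, bound and Maximum Likelihood estimator are not reproduced here.

```python
import math

C = 299_792_458.0  # assumed propagation speed in m/s (RF signal)


def tdoa_fdoa(target, s1, v1, s2, v2, f0, dt_offset=0.0, df_offset=0.0):
    """Noiseless TDOA (s) and FDOA (Hz) between sensor 1 and sensor 2.

    target, s1, s2: (x, y) positions in metres; v1, v2: (vx, vy) sensor
    velocities in m/s; f0: carrier frequency in Hz. dt_offset and df_offset
    model the unknown time and frequency offsets between the two
    unsynchronized sensors (hypothetical parameter names).
    """
    def range_and_rate(s, v):
        dx, dy = s[0] - target[0], s[1] - target[1]
        rng = math.hypot(dx, dy)
        # Range rate: sensor velocity projected on the target-to-sensor line.
        rate = (dx * v[0] + dy * v[1]) / rng
        return rng, rate

    r1, rr1 = range_and_rate(s1, v1)
    r2, rr2 = range_and_rate(s2, v2)
    tdoa = (r1 - r2) / C + dt_offset
    # A growing range lowers the received frequency, hence the minus sign.
    fdoa = -(f0 / C) * (rr1 - rr2) + df_offset
    return tdoa, fdoa
```

Because the two offsets are constant across measurements while the geometry changes as a sensor moves, several such pairs taken at different times could in principle be stacked to estimate the target position and the offsets jointly, which is the idea described in the abstract.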