2010 IEEE International Conference on Acoustics, Speech and Signal Processing: Latest Publications

Learning from high-dimensional noisy data via projections onto multi-dimensional ellipsoids
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495284
Liuling Gong, D. Schonfeld
Abstract: In this paper, we examine the problem of learning from noise-contaminated data in high-dimensional space. A new learning approach based on projections onto multi-dimensional ellipsoids (POME) is introduced, which is applicable to unsupervised clustering, semi-supervised clustering and classification in high-dimensional noisy data. Unlike traditional learning techniques, where local information is used for data analysis, the proposed POME-based scheme incorporates a priori information about the data distribution. Experimental results in unsupervised clustering demonstrate the superiority of the proposed POME-based scheme over some well-known clustering algorithms, including k-means and hierarchical agglomerative clustering. We also illustrate the effectiveness of the proposed POME-based scheme in semi-supervised learning by simulation.
Citations: 0
Rate-distortion performance analysis of an analog motion estimation array
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495511
L. Koskinen, J. Poikonen, M. Laiho, A. Paasio
Abstract: Emerging 3D integration enables integrating high-quality image sensors with various massively parallel processing elements. Analog motion estimation is one potential application, which is likely to result in significant benefits in the form of low-power or high-frame-rate 3D-integrated image sensor-processors. The system-level operation of a proposed analog motion estimation array, supporting all block sizes from 4×4 to 16×16, is examined. The analog motion estimation circuitry has been designed as a 32×32 test array in 0.13 µm CMOS technology. The transistor-level simulation results combined with H.264/AVC JM 14.2 show equivalent rate-distortion results with SAD as the error measure, and an approximately 7% increase in bitrate with a slight increase in image quality for SSE.
Citations: 0
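The abstract above compares SAD and SSE as block-matching error measures. As a rough software illustration only (this is not the analog circuit or the JM 14.2 reference encoder; the function names `sad`, `sse`, `block` and `full_search` are made up for this sketch), an exhaustive block search with either cost might look like this:

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def sse(a, b):
    """Sum of squared errors between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(frame, top, left, size):
    """Extract a size-by-size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(cur, ref, top, left, size, radius, cost):
    """Return the displacement (dy, dx) within +/-radius that minimises
    cost(current block, displaced reference block)."""
    target = block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= y and y + size <= height and 0 <= x and x + size <= width:
                c = cost(target, block(ref, y, x, size))
                if best is None or c < best[0]:
                    best = (c, (dy, dx))
    return best[1]
```

Passing `sad` or `sse` as the `cost` argument switches the error measure without changing the search itself; SSE penalises large pixel differences more heavily than SAD.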
Voice activity detection using harmonic frequency components in likelihood ratio test
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495611
L. Tan, B. J. Borgstrom, A. Alwan
Abstract: This paper proposes a new statistical model-based likelihood ratio test (LRT) VAD to obtain reliable speech/non-speech decisions. In the proposed method, the likelihood ratio (LR) is calculated differently for voiced frames, as opposed to unvoiced frames: only DFT bins containing harmonic spectral peaks are selected for LR computation. To evaluate the new VAD's effectiveness in improving the noise-robustness of ASR, its decisions are applied to pre-processing techniques such as non-linear spectral subtraction, minimum mean square error short-time spectral amplitude estimator, and frame dropping. From the ASR experiments conducted on the Aurora2 database, the proposed harmonic frequency-based LRTs give better results than conventional LRT-based VADs and the standard G.729B and ETSI AMR VADs.
Citations: 44
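To make the idea of restricting the LR computation to harmonic bins concrete, here is a minimal sketch. It assumes the widely used complex-Gaussian per-bin likelihood ratio, log Λ_k = γ_k ξ_k / (1 + ξ_k) − log(1 + ξ_k), with γ_k the a posteriori SNR and ξ_k the a priori SNR. The bin-selection rule (nearest bin to each multiple of a known pitch `f0_hz`), the fixed `xi`, and all function names are assumptions for illustration and are not taken from the paper.

```python
import math


def log_likelihood_ratio(gamma, xi):
    """Per-bin log LR under a complex-Gaussian speech and noise model.
    gamma: a posteriori SNR, xi: a priori SNR."""
    return gamma * xi / (1.0 + xi) - math.log1p(xi)


def harmonic_bins(f0_hz, sample_rate_hz, n_fft):
    """Hypothetical rule: the DFT bin nearest each multiple of f0 below Nyquist."""
    bins = []
    h = 1
    while h * f0_hz < sample_rate_hz / 2:
        bins.append(round(h * f0_hz * n_fft / sample_rate_hz))
        h += 1
    return sorted(set(bins))


def frame_is_speech(power, noise_var, xi, bins, threshold=0.0):
    """Average the log LR over the selected bins and compare to a threshold."""
    total = 0.0
    for k in bins:
        gamma = power[k] / noise_var[k]  # a posteriori SNR in bin k
        total += log_likelihood_ratio(gamma, xi)
    return total / len(bins) > threshold
```

For an unvoiced frame, the same `frame_is_speech` call could be made with `bins` set to every bin index, which corresponds to the conventional full-band test.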
Using an exponential power model for Wyner-Ziv video coding
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5496065
Thomas Maugey, Jérôme Gauthier, B. Pesquet-Popescu, C. Guillemot
Abstract: The Laplacian model is the standard distribution for correlation noise estimation at the turbo decoder in Wyner-Ziv coding schemes. In practice, this hypothesis is not always satisfied and, regularly, the estimated model differs noticeably from the error distribution. In this work, we prove that using a model better fitted to the true distribution improves performance, and we thus propose to use the more general exponential power distribution (EPD), which has never been tested in a distributed video coding context. Gains in rate-distortion over the Laplacian model are illustrated by results on several video sequences, showing that the EPD model outperforms the Laplacian one in off-line (oracle) as well as in on-line (practical implementation) modes. These results also indicate that, in some cases, the on-line EPD model reduces the bitrate even compared with the off-line Laplacian model.
Citations: 10
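The sense in which the EPD is "more general" than the Laplacian can be checked numerically. The sketch below uses one common parameterisation of the zero-mean exponential power density, f(x) = β / (2 α Γ(1/β)) · exp(−(|x|/α)^β); the paper may use a different but equivalent parameterisation, so the symbols `alpha` and `beta` here are assumptions.

```python
import math


def epd_pdf(x, alpha, beta):
    """Zero-mean exponential power density with scale alpha > 0 and shape beta > 0."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


def laplacian_pdf(x, alpha):
    """Zero-mean Laplacian density with scale alpha > 0."""
    return math.exp(-abs(x) / alpha) / (2.0 * alpha)


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma > 0."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))
```

With `beta = 1` the EPD reduces to the Laplacian with the same scale, and with `beta = 2` and `alpha = sigma * sqrt(2)` it reduces to a Gaussian, so fitting `beta` lets the model move between (and beyond) these two shapes.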
An efficient method to generate ground truth for evaluating lane detection systems
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495346
Amol Borkar, M. Hayes, Mark T. Smith
Abstract: In this document, a new and efficient method to specify the ground truth locations of lane markers is presented. The method comprises a novel process called Time-Slicing that provides the user with a unique visualization of the video. Coupled with automation via spline interpolation, the quick generation of necessary ground truth information is achieved. Videos recorded from a vehicle while driving on local city roads and highways are marked with ground truth information for use in testing. The performance of a variety of lane detection systems is compared to the ground truth and the error is computed for each system. Finally, quantitative analysis shows that the reference lane detection system presented in [1] produces the most accurate lane detections, as indicated by the smallest error.
Citations: 28
A new mode selection technique for coding depth maps of 3D video
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495093
D. V. S. X. D. Silva, W. Fernando, H. K. Arachchi
Abstract: Compression of depth maps that are used in 3D video systems based on Depth Image Based Rendering (DIBR) poses a new challenge in video coding, since a depth map is not a sequence of images for final viewing by end users but rather an aid for rendering. Therefore, compressing depth maps using existing video coding techniques yields unacceptable distortions while rendering virtual views. In this paper we propose a novel mode selection method for offline compression of depth maps by selecting modes collaboratively, considering an entire row of macroblocks together. For selecting these modes while encoding, we propose a novel distortion criterion that incorporates rendering distortions instead of the distortion of the depth map itself. A genetic-algorithm-based optimization technique is used for the mode selection. The simulation results suggest that the proposed technique can improve the PSNR by up to 1.6 dB in the rendered stereoscopic views in comparison to the block-wise mode selection method based on Lagrangian optimization and the distortion of the depth map itself.
Citations: 33
A signal-specific bound for joint TDOA and FDOA estimation and its use in combining multiple segments
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495820
A. Yeredor
Abstract: We consider passive joint estimation of the time-difference of arrival (TDOA) and frequency-difference of arrival (FDOA) of an unknown signal at two sensors. The classical approach for deriving the Cramér-Rao bound (CRB) in this context assumes that the signal (as well as the noise) is Gaussian and stationary. As a result, the obtained Fisher information matrix with respect to the TDOA and FDOA is diagonal, implying that the respective estimation errors are uncorrelated (under asymptotic conditions). However, for some specific (non-Gaussian, non-stationary) signals, especially chirp-like signals, these errors can be strongly correlated. In this work we derive a "signal-specific" (or "conditional") CRB for this problem: modeling the signal as a deterministic unknown, we obtain a bound which, given any particular signal, can reflect the possible signal-induced correlation between the TDOA and FDOA estimates. We further demonstrate that this bound is instrumental for proper weighting when combining joint TDOA and FDOA estimates from independent intervals.
Citations: 5
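The last sentence of the abstract above concerns weighting when combining per-segment estimates. One standard way to do this, shown here as a hedged sketch rather than the paper's exact procedure, is inverse-covariance (best linear unbiased) fusion: θ = (Σ C_i⁻¹)⁻¹ Σ C_i⁻¹ θ_i, where each θ_i is a [TDOA, FDOA] pair and each C_i is a 2×2 error covariance. Using a per-segment bound as a stand-in for C_i is an assumption of this example, and the helper names `inv2` and `combine` are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def combine(estimates, covariances):
    """Inverse-covariance fusion of [tdoa, fdoa] estimates from independent segments.
    Returns the fused estimate and its covariance."""
    info = [[0.0, 0.0], [0.0, 0.0]]  # accumulated information matrix
    vec = [0.0, 0.0]                 # accumulated information-weighted estimates
    for theta, cov in zip(estimates, covariances):
        w = inv2(cov)
        for r in range(2):
            vec[r] += w[r][0] * theta[0] + w[r][1] * theta[1]
            for c in range(2):
                info[r][c] += w[r][c]
    fused_cov = inv2(info)
    fused = [fused_cov[r][0] * vec[0] + fused_cov[r][1] * vec[1] for r in range(2)]
    return fused, fused_cov
```

When the covariances have non-zero off-diagonal terms (correlated TDOA and FDOA errors, as for chirp-like signals), this weighting differs from averaging each parameter separately, which is where a bound that captures the correlation would matter.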
Randomized incremental protocols over adaptive networks
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495951
C. G. Lopes, A. H. Sayed
Abstract: We introduce an incremental cooperation mode into the framework of adaptive networks (AN). The method applies to generic topologies and avoids the need to establish a Hamiltonian cycle over the network, generalizing the original incremental mode, while keeping nearly the same mean-square performance, as illustrated by the simulations. We motivate the new mode by relying on an LMS rule at the nodes, and mean-square analysis is provided.
Citations: 43
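One plausible reading of an incremental mode that needs no Hamiltonian cycle is a random walk: the node currently holding the estimate applies an LMS update with its own data and then hands the estimate to a randomly chosen neighbor. The sketch below simulates that reading. The data model, step size, uniform neighbor selection and the name `random_walk_lms` are all assumptions for illustration and may differ from the protocol analysed in the paper.

```python
import random


def random_walk_lms(neighbors, noise_stds, w_true, steps, mu=0.05, seed=0):
    """Simulate a single estimate passed between nodes along a random walk.

    neighbors:  dict mapping node id -> list of neighbor ids
    noise_stds: per-node measurement noise standard deviations
    w_true:     unknown parameter vector that generates the data
    """
    rng = random.Random(seed)
    dim = len(w_true)
    w = [0.0] * dim  # the estimate that travels through the network
    node = 0
    for _ in range(steps):
        # Local data at the current node: regressor u and noisy measurement d.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        d = sum(ui * wi for ui, wi in zip(u, w_true)) + rng.gauss(0.0, noise_stds[node])
        # LMS update using only this node's data.
        err = d - sum(ui * wi for ui, wi in zip(u, w))
        w = [wi + mu * err * ui for wi, ui in zip(w, u)]
        # Hand the estimate to a uniformly chosen neighbor (no fixed cycle needed).
        node = rng.choice(neighbors[node])
    return w
```

A topology such as `{0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}` has a leaf node and therefore no Hamiltonian cycle, yet the random walk still visits every node.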
A dual perspective on separable semidefinite programming with applications to optimal beamforming
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5496110
Yongwei Huang, D. Palomar
Abstract: Consider the downlink beamforming optimization problem with signal-to-interference-plus-noise ratio constraints, null-shaping interference constraints and multiple groups of individual shaping constraints. We propose an efficient algorithm for the problem, which consists of first solving the dual of the semidefinite program (SDP) relaxation, and then formulating a linear program (LP) and solving it to find a rank-one solution of the SDP relaxation. In contrast to the existing algorithms, the analysis of the proposed algorithm includes neither the rank reduction steps (purification process) nor the Perron-Frobenius theorem.
Citations: 1
Temporal motion smoothness measurement for reduced-reference video quality assessment
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2010-03-14 DOI: 10.1109/ICASSP.2010.5495316
Kai Zeng, Zhou Wang
Abstract: Reduced-reference (RR) video quality measures aim to predict the perceptual quality of distorted video signals using only partial information about the reference video. Existing RR video quality assessment models are mostly designed and/or trained for specific applications such as lossy compression, where the detectable distortion types are often fixed and limited. Here we propose a novel approach that measures temporal motion smoothness of a video sequence by examining the temporal variations of local phase structures in the complex wavelet transform domain. We show that the proposed measure can detect a wide range of well-known practical distortions, including noise contamination, blurring, line or frame jittering, and frame dropping. In addition, the proposed algorithm does not require a costly motion estimation process and has a low RR data rate, making it much easier to adopt in real-world visual communication applications.
Citations: 21