2007 IEEE 9th Workshop on Multimedia Signal Processing: Latest Articles

Spatial and Temporal Adaptation of Interpolation Filter For Low Complexity Encoding/Decoding
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412843
D. Rusanovskyy, M. Gabbouj, K. Ugur
Compared to video coding with non-adaptive interpolation filtering, adaptive filters achieve higher compression ratios at the cost of increased encoding and decoding complexity. In our earlier work, we significantly reduced the decoding complexity of adaptive filtering schemes, with minimal impact on coding efficiency, by making use of different filters and adapting them spatially and temporally. However, our previous scheme required high encoder complexity, as several encoding passes per frame were needed to analyze the input image and optimize the selection of interpolation filters. In this paper, we propose a novel algorithm that does not require multiple encoding passes but still gives similar or better performance. This is achieved by using a modified decision-making function that does not require full reconstruction of the coded frame and uses motion and prediction information more efficiently. In addition, we generalize our previous scheme by introducing additional filters, so that better rate-distortion-complexity tradeoffs are possible. Experimental results show that a reduction of up to 50-70% in interpolation complexity is achieved, with a penalty of less than 0.13 dB in coding efficiency.
Citations: 3
Perceptual Enhancement for Fully Scalable Audio
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412811
Te Li, S. Rahardja, S. Koh
MPEG-4 scalable lossless (SLS) coding is the latest released ISO international standard for scalable audio coding. Besides its function as an extension of the MPEG-4 advanced audio coding (AAC) perceptual audio coder, SLS has a "non-core mode" that is able to offer full scalability. The perceptual audio coder is absent in this mode, and scalability is achieved through pure bit-plane coding. In this paper, a perceptually enhanced bit-plane coding method, namely quad-level bit-plane coding (QBPC), is proposed to enhance the perceptual quality of fully scalable audio at intermediate bitrates. With the QBPC structure, the perceptual quality of fully scalable audio coded by SLS is significantly improved over a wide range of intermediate bitrates. This is achieved with trivial added overhead and complexity.
Citations: 1
Impact of Additional Noise on Subjective and Objective Quality Assessement in VoIP
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412813
Zdenek Becvar, L. Novák, J. Zelenka, M. Brada, P. Slepička
The main requirement of Voice over IP technology is good quality of the received voice signal during communication between subscribers. The signal quality can be influenced by many factors, such as packet loss, jitter, packet delay, and noise, and it can be measured by a number of methods. The main purpose of this paper is to investigate the impact of different noise types and noise levels on quality assessment in VoIP. Artificially generated noises and real noises obtained from real telecommunications networks were used for testing. A further goal is to compare the results obtained by subjective listening tests and objective measurement methods. PESQ and 3SQM were used for objective testing in this paper.
Citations: 8
Flexible Video Decoding: A Distributed Source Coding Approach
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412828
Ngai-Man Cheung, Antonio Ortega
We investigate video compression techniques to address problems that require flexible video decoding. In these, the encoder has access to a number of candidate predictors that allow it to exploit source signal correlation, but only a subset of these predictors will be available at the decoder. Crucially, the encoder does not know which predictors will be available. Flexible decoding is important in a number of applications, including frame-by-frame forward and backward video playback, multiview video, bitstream switching, and robust video transmission. The main challenge in supporting flexible decoding is that the encoder needs to compress the current frame under uncertainty about the predictor available at the decoder. An approach based on conventional "closed loop" prediction, e.g., motion-compensated predictive (MCP) coding in the case of video, could be developed by including multiple possible prediction residues in the bitstream, but this would lead to a considerable coding performance penalty if all possible predictor combinations are supported, or to drifting if only some combinations are. Moreover, it is not possible in general to guarantee that decoded versions under different prediction scenarios will be identical. In this paper, we propose a distributed source coding (DSC) based algorithm to tackle the problem. The main novelties of the proposed algorithm are that it incorporates different macroblock modes and significance coding within the DSC framework. This, combined with a judicious exploitation of correlation statistics, allows us to achieve competitive coding performance. Using forward/backward video playback as an example, we demonstrate that the proposed algorithm can outperform a solution based on MCP coding.
Citations: 23
Image alignment with rotation manifolds built on sparse geometric expansions
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412850
E. Kokiopoulou, P. Frossard
In this paper we discuss the problem of aligning patterns under arbitrary rotation. When a generic image pattern is geometrically transformed, it typically spans a (possibly nonlinear) manifold in a high-dimensional space. When the pattern of interest is given by a sparse approximation over a structured dictionary of geometric atoms, we show that the rotation manifold can be expressed analytically as a function of the transformation parameters. At the same time, its high-order derivatives are also given in closed form when the pattern is represented as a sparse linear combination of a few differentiable basis functions. In this framework, the alignment problem is formulated as the minimization of the distance between the reference pattern and the manifold, which boils down to a nonlinear least squares optimization problem. We propose to solve this problem by a Newton-type method, whose solution is facilitated by the analytical expressions of the manifold derivatives. We further derive a heuristic global optimization algorithm based on the Newton method, and provide sufficient conditions for computing the global minimizer. Experimental results demonstrate the effectiveness of the proposed methodology for image alignment and rotation-invariant pattern recognition.
Citations: 1
Dynamic FEC-Distortion Optimization for H.264 Scalable Video Streaming
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412839
Wei-Chung Wen, Hsu-Feng Hsiao, Jen-Yu Yu
Forward error correction codes have been shown to be a feasible solution, at either the application layer or the link layer, for meeting the quality-of-service needs of multimedia streaming over fluctuating channels. In this paper, we propose FEC-distortion optimization algorithms that utilize the bandwidth efficiently for better video quality. The optimization criteria are based on unequal error protection, taking into account the error drifting problems arising from both temporal motion compensation and inter-layer prediction in H.264/MPEG-4 AVC scalable video coding. The algorithms can also adapt to the content-dependent quality contribution of each video frame in a video layer. Lightweight error concealment is also incorporated into the proposed algorithms for better H.264 SVC streaming. For applications where either computation might be the bottleneck or an upper bound on the non-decodable probability of each video layer is specified, an alternative bandwidth allocation algorithm is provided, at the cost of slight quality degradation.
Citations: 12
Analyzing the Multimodal Behaviors of Users of a Speech-to-Speech Translation Device by using Concept Matching Scores
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412867
Jongho Shin, P. Georgiou, Shrikanth S. Narayanan
We investigate factors related to interfacing a speech-to-speech translation device with multimodal capabilities. We evaluate the efficacy of the interactions using a measure of meaning transfer that we call the concept score. We show that employing a multimodal interface improves translation quality by 24% in this study. We also show that while some users require a perfect representation of what they said in order to allow transfer, others accept concept degradation to some extent, with a median of up to 20% in our experiments. An appropriate system strategy is required to recognize this behavior and guide users towards optimum performance points. For example, we show that appropriate feedback is required to guide users in their choice of translation method, as 13% of the choices users made were worse than the alternatives the system provided.
Citations: 0
Multimodal Sensor Analysis of Sitar Performance: Where is the Beat?
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412821
M. S. Benning, A. Kapur, B. Till, G. Tzanetakis
In this paper we describe a system for detecting the tempo of a sitar performance using a multimodal signal processing approach. Real-time measurements are obtained from sensors on the instrument and from wearable sensors on the performer's body. Experiments comparing audio-based and sensor-based tempo tracking are described. The real-time tempo tracking method is based on extracting onsets and applying Kalman filtering. We show how late fusion of the audio and sensor tempo estimates can improve tracking. The results obtained are used to inform design parameters for a real-time system for human-robot musical performance.
Citations: 7
Multiple description image coding with redundant expansions and optimal quantization
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412844
Ivana Radulovic, P. Frossard
This paper addresses the problem of optimal rate allocation for multiple description coding with redundant signal expansions. In the case of redundant descriptions, the quantization of the transform coefficients clearly has to be adapted to the importance of the basis functions, to the redundancy in the representation, and to the expected loss probability on the transmission channel. We derive a rate-distortion optimal solution for the scalar quantization of coefficients in redundant signal representations. The application of the optimal rate allocation to a typical image communication problem demonstrates performance gains with respect to schemes based on uniform quantization with a fixed step size, and to solutions based on unequal error protection.
Citations: 4
Rate-Distortion Optimized I-Slice Selection for Low Delay Video Transmission
2007 IEEE 9th Workshop on Multimedia Signal Processing Pub Date : 2007-10-01 DOI: 10.1109/MMSP.2007.4412831
Yuan Lin, A. N. Kim, Eren Gürses, A. Perkis
Rate smoothing is essential for achieving lower delay when transmitting real-time video over a network. "Explicit slice-based mode selection" (ESM) was recently proposed as a new way of achieving this goal, together with its inherent quality smoothness and error resilience features. However, previous studies focus on the practical aspects and do not address an optimized solution. In this paper, we propose a rate-distortion (RD) optimized solution for finding the best location and size of the intra-coded slices. The experimental results show that, for a target bit rate, the optimized scheme is able to offer performance close to that of mode selection at the macroblock level, over wireless channels with different packet loss rates. Moreover, the optimized ESM algorithm provides significant advantages in granular bitstream prioritization for network transmission. However, RD-based optimization is in general computationally expensive. We therefore propose a heuristic approach that incorporates both channel statistics and sequence characteristics. Results show that it yields close to optimal performance at lower complexity.
Citations: 4