{"title":"Invertible deinterlacing with variable coefficients and its lifting implementation","authors":"T. Ishida, S. Muramatsu, H. Kikuchi, T. Kuge","doi":"10.1109/ICME.2003.1221277","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221277","url":null,"abstract":"Invertible deinterlacing with variable coefficients is proposed to suppress comb-tooth artifacts caused by field interleaving of interlaced scanning video. A vertical highpass filter is applied to detect moving artifacts around boundaries of moving objects. The coefficients of a deinterlacing filter are varied depending on the motion intensity so that the deinterlacing filter may be matched to the local characteristics of moving pictures. Note that the deinterlacing filter is motion-adaptive and is time/translation-varying, while the deinterlacing is still kept to be invertible. The deinterlacing filter performance and its contribution to intraframe-based video coding is evaluated. In addition, since the processing of motion detection and a part of deinterlacing filtering can be shared, their efficient implementation is derived in the form of lifting popular in wavelets.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122770059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using computer vision to generate customized spatial audio","authors":"Ankur Mohan, R. Duraiswami, D. Zotkin, D. DeMenthon, L. Davis","doi":"10.1109/ICME.2003.1221247","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221247","url":null,"abstract":"Creating high quality virtual spatial audio over headphones requires real-time head tracking, personalized head-related transfer functions (HRTFs) and customized room response models. While there are expensive solutions to address these issues based on costly head trackers, measured personalized HRTFs and room responses, these are not suitable for widespread or easy deployment and use. We report on the development of a system that uses computer vision to produce customizable models for both the HRTF and the room response, and to achieve head-tracking. The system uses relatively inexpensive cameras and widely available personal computers. Computer-vision based anthropometric measurements of the head, torso, and the external ears are used for HRTF customization. For low-frequency HRTF customization we employ a simple head-and-torso model developed recently [V. R. Algazi et al., 2002]. For high frequency customization we employ measured pinna characteristics as an index into a database of HRTFs [D. N. Zotkin et al., 2002]. For head tracking we employ an online implementation of the POSIT algorithm [D. DeMenthon and L. Davis, 1995] along with active markers to compute head pose in real-time. The system provides an enhanced virtual listening experience at low cost.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122983329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A programmable, high performance vector array unit used for real-time motion estimation","authors":"Nikolaos Bellas, Malcolm Dwyer","doi":"10.1109/ICME.2003.1220868","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220868","url":null,"abstract":"The MPEG-4 and H.263 video standards are enabling technologies for the proliferation of wireless multimedia applications in 3G systems. For video encoding, the motion estimation (ME) stage is typically the most demanding in terms of performance and bandwidth requirements, and is usually implemented through dedicated hardware, especially in systems with stringent power requirements. This approach, however, cannot exploit any algorithm advances on motion estimation algorithms, and requires major hardware re-design in case of modified specifications or standards. This paper describes the architecture of a programmable motion estimation unit that is used as part of a larger wireless video encoding system. An instruction set architecture (ISA) allows the development of various ME algorithms in software without the need to re-design portion of the chip.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"15 23","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114059190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A scalable virtual programmable real-time testbed for rapid multimedia service creation and evaluation","authors":"Christian Bachmeir, P. Tabery, Serdar Uzumcu, E. Steinbach","doi":"10.1109/ICME.2003.1221297","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221297","url":null,"abstract":"In this paper we describe a new flexible and universally applicable testbed for Internet-based multimedia applications. Our approach combines the technology of programmable networks with emulated network environments and delivers a virtual programmable testbed (VPT). In this work we show that a VPT is well suited for rapid creation and evaluation of distributed multimedia applications. There are three advantages of our approach compared to previously proposed testbed solutions: first, the software for processing IP datagrams, so called atomic modules, only depends on the programmable platform deployed, making it reusable. Second, loading several atomic modules concurrently allows the easy setup of customized test environments, e.g., channel models. Third, the application being developed can be tested under various network conditions (failures, congestion, etc.) at an early stage. Based on our VPT, distributed multimedia services can be created and evaluated both fast and economically in emulated environments without expensive and complex hardware setup.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114499326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality of service controlled adaptive video-coding over IEEE 802.11 wireless links","authors":"J. Taal, I. Haratcherev, K. Langendoen, R. Lagendijk","doi":"10.1109/ICME.2003.1220886","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220886","url":null,"abstract":"In this paper we present the initial results of experiments where encoded video data is transmitted over an IEEE 802.11a wireless link. The changing link state and the perhaps changing constraints imposed by mobile device or user, moves the optimal settings of both the video encoder and the IEEE 802.11 network layer. The video encoder and the network layer negotiates their options facilitated by a quality of service negotiation scheme called \"adaptive resource contracts\", thereby jointly optimizing their parameters. When time is a constraint, this optimization has to be fast and simple, maybe even at the cost of finding suboptimal solutions, as long as the found settings gives a better quality of service than fixed parameter settings.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114602417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time sound localization using field-programmable gate arrays","authors":"Duy Cuong Nguyen, P. Aarabi, A. Sheikholeslami","doi":"10.1109/ICME.2003.1221745","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221745","url":null,"abstract":"This paper presents a single FPGA implementation of a real-time sound localization system using two microphones. The implementation, utilizing a cross-correlation technique based on a modified version of the phase transform, successfully localizes sound sources in noisy environments with as low an SNR as 10 dB. Using the same algorithm and similar hardware architecture, it is shown that up to 5 parallel systems (using 10 microphones), all real-time, can be implemented on a single FPGA while only utilizing an estimated 77 mW-108 mW per microphone.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129829370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved parallel architecture for MPEG-4 motion estimation in 3G mobile applications","authors":"Donglai Xu, Rui Gao, H. Batatia","doi":"10.1109/ICME.2003.1221343","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221343","url":null,"abstract":"A high-parallel VLSI core architecture for MPEG-4 motion estimation is proposed in this paper. It possesses the characteristics of low memory bandwidth and low clock rate requirements, thus primarily aiming at 3G mobile applications. Based on a one-dimensional tree architecture, the architecture employs the dual-register/buffer technique to reduce the preload and alignment cycles. As an example, full-search block matching algorithm has been mapped onto this architecture using a 16-PE array that has the ability to calculate the motion vectors of QCIF video sequences in real time at 1 MHz clock rate and using 15.5 Mbytes/s memory bandwidth.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128773718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"8-bit partial sums of 16 luminance values for fast block motion estimation","authors":"C. Duanmu, M. Ahmad, M. Swamy","doi":"10.1109/ICME.2003.1221011","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221011","url":null,"abstract":"Fast block motion estimation algorithms are needed for real-time implementations of video coding standards due to the high computational complexity of the full-search algorithm for block motion estimation. In this paper, an algorithm using 8-bit partial sums of 16 luminance values for a fast block motion estimation is proposed. The technique of using the partial sums is employed to reduce the computational complexity of not only the full-search algorithm but also some of the fast block motion estimation algorithms while maintaining their accuracy. Furthermore, it is shown that the byte-type data-parallelism on an SIMD architecture can be utilized to access and process these partial sums concurrently to accelerate the process of motion estimation. Simulation results are presented to demonstrate that the use of the partial sums can accelerate the execution of the full-search, three-step search, and four-step search algorithms on an SIMD architecture significantly.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128566655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new congestion control algorithm for layered multicast in heterogeneous multimedia dissemination","authors":"Qiang Liu, Jenq-Neng Hwang","doi":"10.1109/ICME.2003.1221671","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221671","url":null,"abstract":"Layered multicast is a promising technique for disseminating adaptive-quality audio/video to multiple heterogeneous receivers. Congestion control in a layered multicast scheme is very important to support heterogeneity and scalability. Previous works on congestion control use packet loss, delay, and receiving rate to infer whether there is spare capacity along the path, which suffers from slow convergence, lack of inter-session fairness or TCP-fairness, layer oscillations, and loss induced by the join experiments. In this paper, we propose a layered multicast bandwidth inference congestion (BIC) control, which uses delay increasing trend detection to infer the spare capacity. The major contribution of this paper is introducing the source probe organization and effective spare capacity inference by delay trend detection algorithm. We evaluate BIC for a large variety of scenarios and show that it converges fast to the optimal link utilization and can adapt to network dynamics effectively. We also show that BIC is stable, inter-session fair and fair to competing TCP traffic.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129009932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The security flaws in some authentication watermarking schemes","authors":"Yongdong Wu, F. Bao, Changsheng Xu","doi":"10.1109/ICME.2003.1221661","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221661","url":null,"abstract":"Watermarking technology was originally proposed for copyright protection. Recently it has been applied to media authentication so that a proof of authenticity is inserted into the media instead of being appended to the media as a separated attachment. However, security requirements of the authentication are overlooked in some authentication watermark schemes. In this paper we analyze three authentication watermarking schemes and point out their security flaws. The first scheme is the color authentication scheme in [S.C. Byun et al., 2002]. The scheme is not secure in the sense that as long as an attacker obtains one authenticated image, he is able to forge authentic images without the secret key. The second scheme [Ping Wah Wing, et al., 2001] is an authentication scheme but it is extended for ownership incorrectly. The third one, the robust invertible watermarking scheme [J. Friedrich et al., 2002], employs a multiple of secret random sequences to produce a watermark. However these sequences are independent of the original images, i.e., they remain invariable for different images. An adversary, having sufficient number of original images, can reconstruct the secret sequences by solving simultaneous equations. With these reconstructed sequences, the attacker can forge authentic image freely. The attack can be thwarted with content related sequences generated from both the secret key and the original image.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124768486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}