{"title":"Single-pass distortion-smoothing encoding for low bit-rate video streaming applications","authors":"Tao Chen, Zhihai He","doi":"10.1109/ICME.2003.1221597","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221597","url":null,"abstract":"This paper proposes a rate control scheme for smoothing distortion in low bit-rate video coding. We focus on single-pass encoding that is applicable to both live and off-line streaming. The new rate control tends to achieve slow and smooth distortion variation over time. Without the knowledge of future frames, statistics of previously coded frames are used to derive the expected distortion for the current frame. Constraints of decoder buffer size and pre-loading time are considered in the design. The proposed technique is based on the rate and distortion models used in the TMN8. Its advantages have been shown by experiments.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131053270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised object-based sprite coding system for tennis sport","authors":"Ching-Yeh Chen, Shao-Yi Chien, Yi-Hau Chen, Yu-Wen Huang, Liang-Gee Chen","doi":"10.1109/ICME.2003.1220923","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220923","url":null,"abstract":"Sprite coding is a new object-based coding technology proposed by the MPEG-4 video standard. In this paper, we propose an unsupervised sprite coding system for sport videos, for example, tennis sequences. Our system can provide several important functions. First, the sprite of the background can be generated without any pre-processing masks in our system. Second, it can automatically segment the foreground and background in a video sequence. Third, it can provide the masks of the foreground objects and the tennis ball. The experimental results show that the coding gain of our system compared with the MPEG-4 advanced simple profile is 2.5 dB at the low bit rate (270 Kbps) and 2 dB at the ultra-low bit rate (70 Kbps). It can be used for sport video coding at low bit rates and provides an object-based sport video sequence.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122362120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech-assisted facial expression analysis and synthesis for virtual conferencing systems","authors":"Yao-Jen Chang, C. Hsieh, Pei-Wei Hsu, Yung-Chang Chen","doi":"10.1109/ICME.2003.1221365","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221365","url":null,"abstract":"Fast, reliable, and marker-free facial expression analysis remains a difficult task in computer vision research. In this paper, the concept of speech-assisted facial expression analysis and synthesis is proposed, which shows that the speech-driven facial animation technique can be used not only for expression synthesis but also for expression analysis. From the input speech, the mouth shape can be estimated from the audio-visual model. Thus, the large search space of mouth appearance is reduced for mouth tracking. Similarly, the modeling technique is extended from modeling speech and mouth shape to facial movements and detailed facial texture changes. In this way, a virtual conferencing system with video-realistic avatars is realized to meet the real-time requirement.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130694445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint resolution enhancement and artifact reduction for MPEG-2 encoded digital video","authors":"Yibin Yang, L. Böröczky","doi":"10.1109/ICME.2003.1221601","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221601","url":null,"abstract":"A joint approach of resolution enhancement and artifact reduction for MPEG-2 encoded video is proposed in this paper. This new method uses a unified metric for digital video processing (UMDVP). The UMDVP is defined based on the coding information of MPEG-2 encoded video. It also takes the local scene content into account. Experimental results have demonstrated that the joint approach using the UMDVP outperforms the system of resolution enhancement and artifact reduction without the help of UMDVP. The proposed method can improve the performance of various emerging video systems using MPEG-2 compression.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130697435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-resolution image inpainting","authors":"T. Shih, Liang-Chen Lu, Ying-Hong Wang, Rong-Chi Chang","doi":"10.1109/ICME.2003.1220960","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220960","url":null,"abstract":"Digital inpainting is an image interpolation mechanism, which can automatically restore damaged or partially removed image. Most inpainting mechanisms use a singular resolution approach on the extrapolation or interpolation of pixels. We propose a multi-resolution algorithm, which can take into consideration the different levels of details. The algorithm was tested on 1000 still images, with an evaluation showing the effectiveness of our approach. The demonstration of our work is available at: http://www.mine.tku.edu.tw/demos/inpaint.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132019536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-parametric approach to ICA using kernel density estimation","authors":"K. Sengupta, P. Burman","doi":"10.1109/ICME.2003.1221026","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221026","url":null,"abstract":"Independent component analysis (ICA) has found a wide range of applications in signal processing and multimedia, ranging from speech cleaning to face recognition. This paper presents a non-parametric approach to the ICA problem that is robust towards outlier effects. The algorithm, for the first time in the field of ICA, adopts an intuitive and direct approach, focusing on the very definition of independence itself; i.e. the joint probability density function (pdf) of independent sources is factorial over the marginal distributions. This is contrary to traditional ICA algorithms, which achieve the objective by attempting to fulfill necessary (but not sufficient) conditions for independence. For example, the Jade algorithm attempts to approximate independence by minimizing higher order statistics. In the proposed algorithm, kernel density estimation is employed to provide a good approximation of the distributions that are required to be estimated. This estimation technique is inherently robust towards outlier effects. The application of kernel density estimation also enables the algorithm to be free from the assumptions of source distributions. Experimental results show that the algorithm is able to perform separation of sources in the presence of outliers, whereas existing algorithms like Jade and Infomax break down under such conditions. The results have also shown that the proposed non-parametric approach is generally source distribution independent. In addition, it is able to separate non-Gaussian zero-kurtotic signals, unlike traditional ICA algorithms such as Jade and Infomax.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"82 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132153637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A paper-based interface for video browsing and retrieval","authors":"J. Graham, J. Hull","doi":"10.1109/ICME.2003.1221725","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221725","url":null,"abstract":"A paper-based interface for browsing video is proposed. A paper document shows key frames selected from a video, a transcript for the parallel audio track, and bar codes that, when scanned, invoke a multimedia player. The paper document provides a stand-alone representation for a video recording that lets a user both understand the content of the file and replay only selected parts of the multimedia that are necessary to gain a better understanding. This approach applies the two-dimensional display characteristics of a newspaper to multimedia retrieval. By so doing, the user's browsing and search efficiency is greatly improved. This poster describes an implementation of the video paper system using a pocket PC with a bar code reader as the remote control device and an archive of video recordings on the pocket PC or an external server.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127866423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual attention based image browsing on mobile devices","authors":"X. Fan, Xing Xie, Wei-Ying Ma, HongJiang Zhang, He-Qin Zhou","doi":"10.1109/ICME.2003.1220852","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220852","url":null,"abstract":"Images have become more and more common in mobile communications. People now can easily take and exchange pictures on the move using their mobile devices and digital cameras. However, a crucial challenge is to provide a better user experience for browsing large images on limited and heterogeneous screen sizes of mobile devices. In this paper, we propose a novel image viewing technique based on an adaptive attention shifting model. A presentation technique named rapid serial visual presentation (RSVP), borrowed from the UI community, is used to simulate the attention shifting process. We show a prototype image viewer developed for pocket PC and conduct some evaluations to demonstrate the effectiveness of our approach.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127877802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedded multiple description scalar quantizers for progressive image transmission","authors":"A. Gavrilescu, A. Munteanu, P. Schelkens, J. Cornelis","doi":"10.1109/ICME.2003.1220970","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220970","url":null,"abstract":"Robust progressive image transmission over unreliable channels with variable bandwidth requires multiple description coding (MDC) systems that produce highly error-resilient embedded bit-streams. The proposed embedded multiple description scalar quantizers (EMDSQ) meet the desired features consisting of a high redundancy level, fine grain rate adaptation and progressive transmission of each description. Experimental results show that EMDSQ yield better rate-distortion performance in comparison to the multiple description uniform scalar quantizers (MDUSQ) previously proposed in the literature. Moreover, the generalized form of EMDSQ targeting an arbitrary number of channels is proposed, which offers the possibility of designing realistic coders for practical multi-channel communication systems.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125491168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active selection for multi-example querying by content","authors":"A. Natsev, John R. Smith","doi":"10.1109/ICME.2003.1220950","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220950","url":null,"abstract":"Multi-example content-based retrieval (MECBR) is the process of querying content by specifying multiple query examples with single query iteration. MECBR attempts to mitigate some of the semantic limitations of traditional relevance feedback or CBR techniques by allowing multiple query examples and thus a more accurate modeling of the user's query need. It also attempts to minimize the burden on the user, as compared to relevance feedback methods, by eliminating the need for user feedback and limiting all interaction into a single query specification step. Multi-example content-based retrieval is therefore a simple alternative for modeling low- and mid-level semantics without the need for heavy user interaction or extensive training, as in interactive feedback systems or complex statistical modeling approaches. In this paper, we describe the MECBR technique in some detail and study methods for active selection of query examples and query features. In particular, we propose and investigate techniques for automatic query example selection, feature selection, and feature fusion. We compare different approaches and evaluate performance of different parameter settings through an extensive empirical study. We also compare MECBR performance to that of explicitly built semantic models using state-of-the-art support vector machines (SVM). We find that lightweight MECBR performs up to 60% better for rare concepts and only 12% to 25% worse for frequent concepts, as compared to heavy-weight SVM modeling! This shows that MECBR is not only a viable lightweight alternative to statistical semantic modeling but is also preferred for very diverse or rare-class semantic modeling situations.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121283725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}