{"title":"Affine-Invariant Image Retrieval Based on Wavelet Interest Points","authors":"Guiguang Ding, Qionghai Dai, Wenli Xu, Feng Yang","doi":"10.1109/MMSP.2005.248678","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248678","url":null,"abstract":"This paper presents an affine-invariant image retrieval approach based on a wavelet-based detector, which uses the space-tree property of the transform coefficients to estimate the interest points. Meanwhile, in order to retrieve images compressed by a wavelet algorithm such as JPEG2000, the detector only uses the partial bit-planes of the wavelet coefficients to detect the interest points. To provide affine-invariant image matching, annular color histogram, annular texture histogram and spatial cohesion based on interest points are presented to describe image features. A series of experiments based on an image database consisting of 1000 images are performed to confirm the effectiveness of our method","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129194858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeled Analysis on Requantization Error","authors":"Bo Shen","doi":"10.1109/MMSP.2005.248627","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248627","url":null,"abstract":"Requantization has many applications in the processing of compressed multimedia content. However, no work has thorough analysis on its dynamics. A previous work has only numerically evaluated this process since it is found difficult to track analytically. We present a new approach in this paper to carry out an analytical study based on analysis of the intervals using number theory. This work proves some of the insights proposed before as propositions. The theoretical results also serve as a foundation in further algorithm development for requantization-related applications","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116495375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Coding of Spherical Images with Jointly Refined Decoding","authors":"T. Tillo, B. Penna, P. Frossard, P. Vandergheynst","doi":"10.1109/MMSP.2005.248618","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248618","url":null,"abstract":"This work addresses the coding of 3-dimensional scenes, as captured by distributed vision sensors with catadioptric cameras. Spherical images allow for avoiding distortion due to the common Euclidean assumption in the representation of the plenoptic function. We consider here low complexity encoding of the sensor outputs, in a framework where the cameras could be placed anywhere in the scene, and where the sensors do not communicate with each other. Since multiple spherical images of the same scene most probably provide a redundant representation, we propose to have different compression ratios for different cameras, in order to reduce the overhead of information. The decoder performs a joint decoding of the multiple images, by motion estimation, and joint refinement by consistent inverse quantization. It is finally shown that, even in the absence of any information about the scene or the position of the cameras, the proposed scheme offers improved performance with respect to an independent encoding of the spherical images, especially at low coding rate","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124123547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Flexible Multi-Rate Allocation Scheme for Balanced Multiple Description Coding Applications","authors":"T. Tillo, E. Baccaglini, G. Olmo","doi":"10.1109/MMSP.2005.248579","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248579","url":null,"abstract":"When transmitting multimedia information over non-prioritized networks subject to packet losses, multiple description coding with an arbitrary number of descriptions is an effective choice in order to minimize the end-to-end distortion. Such descriptions should be generated so that the quality obtained decoding a subset of them depends only on their number and not on the particular received subset. In this paper, we propose an encoding procedure to generate an arbitrary number of balanced descriptions using a multi-rate allocation scheme which exploits the R-D characteristic of the data","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"44 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134288867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Requantization in Intra-Frame Video Transcoding with Different Transform Block Sizes","authors":"J. Bialkowski, M. Barkowsky, André Kaup","doi":"10.1109/MMSP.2005.248669","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248669","url":null,"abstract":"Transcoding is a technique to convert one video bit-stream into another. While homogeneous transcoding is done at the same coding standard, inhomogeneous transcoding converts from one standard format to another standard. Inhomogeneous transcoding between MPEG-2, MPEG-4 or H.263 was performed using the same transform. With the standardisation of H.264 also a new transform basis and different block size was defined. For requantization from block size 8×8 to 4×4 this leads to the effect that the quantization error of one coefficient in a block of size 8×8 is distributed over multiple coefficients in blocks of size 4×4. In our work, we analyze the requantization process for inhomogeneous transcoding with different transforms. The deduced equations result in an expression for the correlation of the error contributions from the coefficients of block size 8×8 at each coefficient of block size 4×4. We then compare the mathematical analysis to simulations on real sequences. The reference to the requantization process is the direct quantization of the undistorted signal. It will be shown that the loss is as high as 3 dB PSNR at equivalent step size for input and output bitstream. Also an equation for the choice of the second quantization step size in dependency of the requantization loss is deduced. The model is then extended from the DCT to the integer-based transform as defined in H.264","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132946176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Recognition Using Recursive Cluster-Based Linear Discriminant","authors":"C. Xiang, Dong Huang","doi":"10.1109/MMSP.2005.248638","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248638","url":null,"abstract":"Two new recursive procedures for extracting discriminant features, termed recursive modified linear discriminant (RMLD) and recursive cluster-based linear discriminant (RCLD) are proposed in this paper. The two new methods, RMLD and RCLD overcome two major shortcomings of fisher linear discriminant (FLD): it can fully exploit all information available for discrimination; and it removes the constraint on the total number of features that can be extracted. Experiments of comparing the new algorithm with the traditional FLD and some of its variations have been carried out on various types of face recognition problems for Yale database, in which the resulting improvement of the performances by the new feature extraction scheme is significant","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126585600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A DCT-Domain Video Alignment Technique for MPEG Sequences","authors":"Ming-Sui Lee, Mei-Yin Shen, C.-C. Jay Kuo","doi":"10.1109/MMSP.2005.248599","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248599","url":null,"abstract":"An image/video registration technique for multiple compressed video inputs such as MPEG sequences is investigated. The proposed technique is based on the matching of discrete cosine transform (DCT) coefficients and motion vectors. First, the I frame of each input sequence is separated into the background and moving objects. For the background, coarse edge features are extracted by applying edge detectors of different characteristics to the luminance DC coefficients. Each detector generates a difference map for a single background. A threshold is determined for each difference map to produce a binary map. Then, alignment parameters are determined using the binary maps of input images generated by the same detector. For the moving object, alignment parameters can be finetuned by the motion information of all frames in the same group of pictures (GOP). Finally, the actual displacement in the pixel domain is estimated by the weighted average of alignment parameters from all background detectors and refinement parameters from motion information. It is shown by experimental results that the proposed method reduces the computational cost of image/video registration significantly in comparison with the traditional pixel domain registration techniques while achieving certain quality of composition","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126592566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining Stream Switching with Fine-Grained Intra-Stream Adaptation for Adaptive Video Streaming","authors":"Michael Kropfberger, H. Hellwagner","doi":"10.1109/MMSP.2005.248654","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248654","url":null,"abstract":"Video streaming systems in best effort networks have to somehow cope with dynamically changing bandwidth. Various scalable video codecs allow intra-stream adaptation by use of temporal, spatial, or quality (SNR) scalability; optimizations for finer grained scalability are available as layered coding and FGS techniques. However, if there is no scalable video stream at hand, stream switching among pre-encoded stream versions of different bitrates and qualities allows at least coarse-grained adaptation. Those different approaches compete to be the most efficient solution for adaptive video streaming. However, this paper will show that the efficacy is significantly increased by combining those approaches. As will be discussed, the combination of coarse-grained stream switching and temporal intra-stream adaptation offers better visual results and more stable client buffer behavior than the denoted approaches used separately","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133529672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Steganalysis Based on Statistical Moments of Wavelet Subband Histograms in DFT Domain","authors":"Guorong Xuan, Jianjiong Gao, Y. Shi, D. Zou","doi":"10.1109/MMSP.2005.248584","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248584","url":null,"abstract":"This paper proposes an image steganalysis scheme based on statistical moments of the histograms of multi-level wavelet subbands in the frequency domain. Our theoretical analysis has pointed out that the statistical moments of the histogram in the frequency domain are more sensitive to data embedding than the statistical moments of the histogram in the spatial domain. We test the performance of our proposed scheme over non-blind spread spectrum (SS) data hiding method, blind SS method, block based SS method, LSB method and QIM data hiding methods. Besides, steganographic tools such as Outguess, JSteg and F5 are tested. The experimental results have shown that the proposed method outperforms the prior art by Farid and Harmsen","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115752184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BASS: BitTorrent Assisted Streaming System for Video-on-Demand","authors":"C. Dana, Danjue Li, David Harrison, C. Chuah","doi":"10.1109/MMSP.2005.248586","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248586","url":null,"abstract":"This paper introduces a hybrid server/P2P streaming system called bittorrent-assisted streaming system (BASS) for large-scale video-on-demand (VoD) services. By distributing the load among P2P connections as well as maintaining active server connections, BASS can increase the system scalability while decreasing media playout wait times. To analyze the benefits of BASS, we examine torrent trace data collected in the first week of distribution for Fedora Core 3 and develop an empirical model of bittorrent client performance. Based on this, we run trace-based simulations to evaluate BASS and show that it is more scalable than current unicast solutions and can greatly decrease the average waiting time before playback","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115012470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}