{"title":"A spectral clustering approach to motion segmentation based on motion trajectory","authors":"Hongbin Wang, Hua Lin","doi":"10.1109/ICME.2003.1221736","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221736","url":null,"abstract":"Multibody motion segmentation is important in many computer vision tasks. This paper presents a novel spectral clustering approach to motion segmentation based on motion trajectory. We introduce a new affinity matrix based on the motion trajectory and map the feature points into a low dimensional subspace. The feature points are clustered in this subspace using a graph spectral approach. By computing the sensitivities of the larger eigenvalues of a related Markov transition matrix with respect to perturbations in affinity matrix, we improve the piecewise constant eigenvectors condition [M. Meila et al., 2001] dramatically. This makes clustering much reliable and robust. We confirm it by experiments.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130745929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive resolution vector quantization technique and basic codebook design method for compound image compression","authors":"T. Nakayama, M. Konda, K. Takeuchi, K. Kotani, T. Ohmi","doi":"10.1109/ICME.2003.1221675","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221675","url":null,"abstract":"In order to increase the performance of image compression by vector quantization (VQ), we propose a systematic codebook design method without using learning sequences for 4/spl times/4 and 2/spl times/2 pixel blocks. According to the method, the codebook can be applied to all kinds of images and exhibits equivalent compression performance to the specific codebooks created individually by conventional learning method using corresponding images. Furthermore, we have developed a novel VQ-based image-coding algorithm suitable for compound images. Adaptive resolution VQ (AR-VQ) method, which is composed of three key techniques, i.e., the edge detection, the resolution conversion, and the entropy coding, can realize much superior compression performance than the JPEG and the JPEG-2000. On the compression of the XGA (1024/spl times/768 pixels) images including text, for instance, there exist an overwhelming performance difference of 5 to 40 dB in compressed image quality.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133033687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge-based semantic classification of sports video sequences","authors":"Michael H. Lee, S. Nepal, Uma Srinivasan","doi":"10.1109/ICME.2003.1220878","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220878","url":null,"abstract":"This paper presents an edge-based semantic classification of sports video sequences. The paper presents an algorithm for edge detection, and illustrates the usage of edges for semantic analysis of video content. We first propose an algorithm for detecting edges within video frames directly on the MPEG format without a decompression process. The algorithm is based on a spatial-domain synthetic edge model, which is defined using interrelationship of two DCT edge features: horizontal and vertical. We then use a multi-step approach to classify video sequences into meaningful semantic segments such as \"goal\", \"foul\", and \"crowd\" in basketball games using the \"edgeness\" criteria. We then show how an audio feature (\"whistles\") can be used as a filter to enhance edge-based semantic classification.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133685793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved block dependent fragile image watermark","authors":"Liu Feilong, Wang Yangsheng","doi":"10.1109/ICME.2003.1221663","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221663","url":null,"abstract":"Surrounding neighbor blocks dependent fragile watermark scheme has its disadvantages, which couldn't successfully distinguish where the image has been tampered. In this paper, we propose a random block dependant fragile image watermark scheme. Two series, one is generated by this block and its prior block, the other is generated by this block itself, are embedded into the LSBs of this image block to keep itself secure and dependent on its prior block, which make VQ attack impossible. Decision strategy is implemented to detect whether the image blocks are authentic in tamper detection. Analysis and experimental results demonstrate that our algorithm can detect even one bit image alteration with graceful localization, and successfully resist vector quantization attack simultaneously without the need of any unique keys or image indexes.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132737083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lossless and lossy minimal redundancy pyramidal decomposition for scalable image compression technique","authors":"Marie Babel, O. Déforges","doi":"10.1109/ICME.2003.1221273","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221273","url":null,"abstract":"We present a new scalable compression technique dealing simultaneously with both lossy and lossless image coding. An original DPCM scheme with refined context is introduced through a pyramidal decomposition adapted to the LAR (locally adaptive resolution) method, which becomes by this way fully progressive. An implicit context modeling of the prediction errors, due to the low resolution image representation including variable block size structure, is then exploited to the for lossless compression purpose.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132754553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Eye gaze and speech for data entry: a comparison of different data entry methods","authors":"Y. Tan, N. Sherkat, Tony Allen","doi":"10.1109/ICME.2003.1220849","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220849","url":null,"abstract":"In this paper we present a multimodal interface that employs speech recognition and eye gaze tracking technology for use in data entry tasks. The aim of this work is to compare the usability of this multimodal system against other data entry methods (handwriting, mouse and keyboard and speech only) when carrying out the data entry task of filling a form. Discussions regarding the relationships between efficiency, effectiveness, ergonomic quality, hedonic quality, naturalness, familiarity and users preference are presented. The experimental results show that the majority of the users prefer using the proposed eye and speech system compared to the other form-filling methods even though such a method is neither the fastest nor the most accurate.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133263840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel method to embed watermark in different halftone images: data hiding by conjugate error diffusion (DHCED)","authors":"M. Fu, O. Au","doi":"10.1109/ICME.2003.1220991","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220991","url":null,"abstract":"In this paper, we propose a novel way called DHCED to hide invisible patterns in two or more visually different halftone images (e.g. Lena and Harbor) such that the hidden patterns would appear on the halftone images when they are overlaid. Conjugate error diffusion is used to embed the binary visual pattern in the two distinct halftone images. Simulation results show that the two halftone images have good visual quality, and the hidden pattern is visible when the two distinct halftone images are overlaid.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131398194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast stitching algorithm for moving object detection and mosaic construction","authors":"J. Hsieh","doi":"10.1109/ICME.2003.1220860","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220860","url":null,"abstract":"This paper proposes a novel edge-based stitching method to detect moving objects and construct mosaics from images. The method is a coarse-to-fine scheme which first estimates a good initialization of camera parameters with two complementary methods and then refines the solution through an optimization process. The two complementary methods are the edge alignment and correspondence-based approaches, respectively. Since these two methods are complementary to each other, the desired initial estimate can be obtained more robustly. After that, a Monte-Carlo style method is then proposed for integrating these two methods together. Then, an optimization process is applied to refine the above initial parameters. Since the found initialization is very close to the exact solution and only errors on feature positions are considered for minimization, the optimization process can be very quickly achieved. Experimental results are provided to verify the superiority of the proposed method.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131868656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D face modeling using two orthogonal views and a generic face model","authors":"A. Ansari, M. Abdel-Mottaleb","doi":"10.1109/ICME.2003.1221305","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221305","url":null,"abstract":"We present an algorithm for 3-D face modeling from a frontal and a profile view images of a person's face. The algorithm starts by computing the 3D coordinates of automatically extracted facial feature points. The coordinates of the selected feature points are then used to deform a 3D generic face model to obtain a 3D face model for that person. Procrustes analysis is used to globally minimize the distance between facial feature vertices in the model and the corresponding 3D points obtained from the images. Then, local deformation is performed on the facial feature vertices to obtain a more realistic 3D model for the person. Preliminary experiments to asses the applicability of the models for face recognition show encouraging results.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115654563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-based home photo retrieval","authors":"Joo-Hwee Lim, P. Mulhem, Q. Tian","doi":"10.1109/ICME.2003.1221546","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221546","url":null,"abstract":"With rapid advances in sensor, storage, processor, and communication technologies, consumers can now afford to create, store, process, and share large digital photo collections. With more and more digital photos accumulated, consumers need effective and efficient tools to organize and access photos in a semantically meaningful way without too much manual annotation effort. From user studies, we confirm that users prefer to organize and access photos along semantic axes such as event, people, time, and place. In this paper, we propose a computational learning framework to construct event models from sample photos with event labels given by a user and to compute relevance measures of unlabeled photos to the event models. We demonstrate event-based retrieval on 2400 genuine home photos using our proposed approach.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115901700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}