MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500219
Junchul Chun, G. Stockman
{"title":"Subband image segmentation using VQ for content-based image retrieval","authors":"Junchul Chun, G. Stockman","doi":"10.1145/500141.500219","DOIUrl":"https://doi.org/10.1145/500141.500219","url":null,"abstract":"Retrieving images from a large image dataset using image content as a key is an important issue. In this paper, we present a new content-based image retrieval approach using a wavelet transform and subband image segmentation. For the image retrieval, we first decompose the image using a wavelet transform and adopt a vector quantization (VQ) algorithm to perform automatic segmentation based on image features such as color and texture. The wavelet transform decomposes the image into 4 subbands (LL, LH, HL, HH). Only the LL component is further decomposed until the desired depth is reached. The image segmentation is performed using the HIS color and texture features of the low-pass subband component image. The VQ provides a transformation from the raw pixel data to a small group of homogeneous classes which are coherent in color and texture space. For managing a large image dataset, image compression is usually considered. In that sense, the segmentation of a compressed image or subband image is more efficient compared with using an uncompressed image since the compressed image preserves the information needed for the image segmentation task. An important aspect of the system is that using a subband image of the wavelet transform can reduce the size and noise of the image. Thus, we can subsequently reduce the computational burden for the image segmentation. The experimental results of the proposed image retrieval system confirm the feasibility of our approach in retrieval accuracy and in lowering computational cost compared to using the original image.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126550797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500150
Gerald Kühne, Stephan Richter, Markus Beier
{"title":"Motion-based segmentation and contour-based classification of video objects","authors":"Gerald Kühne, Stephan Richter, Markus Beier","doi":"10.1145/500141.500150","DOIUrl":"https://doi.org/10.1145/500141.500150","url":null,"abstract":"The segmentation of objects in video sequences constitutes a prerequisite for numerous applications ranging from computer vision tasks to second-generation video coding. We propose an approach for segmenting video objects based on motion cues. To estimate motion we employ the 3D structure tensor, an operator that provides reliable results by integrating information from a number of consecutive video frames. We present a new hierarchical algorithm, embedding the structure tensor into a multiresolution framework to allow the estimation of large velocities. The motion estimates are included as an external force into a geodesic active contour model, thus stopping the evolving curve at the moving object's boundary. A level set-based implementation allows the simultaneous segmentation of several objects. As an application based on our object segmentation approach we provide a video object classification system. Curvature features of the object contour are matched by means of a curvature scale space technique to a database containing preprocessed views of prototypical objects. We provide encouraging experimental results calculated on synthetic and real-world video sequences to demonstrate the performance of our algorithms.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126636043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500157
Peng Wu, B. S. Manjunath
{"title":"Adaptive nearest neighbor search for relevance feedback in large image databases","authors":"Peng Wu, B. S. Manjunath","doi":"10.1145/500141.500157","DOIUrl":"https://doi.org/10.1145/500141.500157","url":null,"abstract":"Relevance feedback is often used in refining similarity retrievals in image and video databases. Typically this involves modification to the similarity metrics based on the user feedback and recomputing a set of nearest neighbors using the modified similarity values. Such nearest neighbor computations are expensive given that typical image features, such as color and texture, are represented in high-dimensional spaces. Search complexity is a critical issue when dealing with large databases, and this issue has not received much attention in relevance feedback research. Most of the current methods report results on very small data sets, on the order of a few thousand items, where a sequential (and hence exhaustive) search is practical. The main contribution of this paper is a novel algorithm for adaptive nearest neighbor computations for high-dimensional feature vectors when the number of items in the database is large. The proposed method exploits the correlations between two consecutive nearest neighbor searches when the underlying similarity metric is changing, and filters out a significant number of candidates in a two-stage search and retrieval process, thus reducing the number of I/O accesses to the database. Detailed experimental results are provided using a set of about 700,000 images. Comparison with the existing method shows an order of magnitude overall improvement.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116935720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500210
Zhijun Lei
{"title":"Media transcoding for pervasive computing","authors":"Zhijun Lei","doi":"10.1145/500141.500210","DOIUrl":"https://doi.org/10.1145/500141.500210","url":null,"abstract":"The rapid development of wireless technologies and computer-embedded devices makes it possible for people to use portable devices to access multimedia information and services. In order to bring multimedia information and services to the various client devices while retaining the ability to go mobile, multimedia information must be adapted, which is referred to as media transcoding technology. In this paper, some related issues of media transcoding are discussed. Media transcoding techniques are classified from different perspectives. Then a general framework for media transcoding is proposed and explored. Video transcoding is discussed as the major research focus. Some methods in terms of video rate control, spatial resolution reduction, and heterogeneous video transcoding are explored.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123132332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500257
Xiaojun Shen, S. Nourian, Isabelle Hertanto, N. Georganas
{"title":"vCOM: virtual commerce in a collaborative 3D world","authors":"Xiaojun Shen, S. Nourian, Isabelle Hertanto, N. Georganas","doi":"10.1145/500141.500257","DOIUrl":"https://doi.org/10.1145/500141.500257","url":null,"abstract":"Existing electronic commerce applications only provide the user with a relatively simple browser-based interface to access products. Buyers, however, are not provided with the same shopping experience as they would have in an actual store or shopping mall. With the creation of a virtual shopping mall, simulations of most of the actual shopping environments and user interactions can be achieved. The virtual mall brings together the services and inventories of various vendors. Users can either navigate through the vendors, adding items into a virtual shopping cart, or perform intelligent searches through \"user and vendor agents\". The electronic mall prototype also allows the user to communicate with an \"intelligent assistant\" (IA) using simple voice commands. This assistant interacts with the shopper using voice synthesis and helps him or her use the interface to navigate efficiently in the mall. Real-time interactions among entities in the virtual environment are implemented over the Run Time Infrastructure of the High Level Architecture (RTI/HLA), an OMG and IEEE standard for distributed simulations and modeling developed by the US Department of Defense.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128907048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500146
Shingo Uchihashi
{"title":"Improvising camera control for capturing meeting activities using a floor plan","authors":"Shingo Uchihashi","doi":"10.1145/500141.500146","DOIUrl":"https://doi.org/10.1145/500141.500146","url":null,"abstract":"This paper describes camera control interfaces for capturing meetings and presentations into multimedia documents. While technologies are maturing to deliver multimedia documents over networks, skilled human hands are still required to create the contents. We dug into the problem and found that some portion of it derives from current camera control systems, which only provide interfaces for incremental navigation. Presets are provided in some systems to avoid cumbersome manipulations, but the difficulty of improvising controls remains unaddressed. We introduced an interface using a floor plan for selecting an arbitrary area to be captured. We also conducted a study to compare our method with two other typical camera control interfaces. The results revealed that our method was significantly better than a typical joystick-metaphor interface. Although an interface with presets proved superior for completing tasks in limited conditions, the participants judged the use of a floor plan to be equally good with respect to ease of camera navigation as they intended. They also indicated possible improvements for our interface to close in on the one with presets.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115768112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500184
M. K. Bradshaw, Bing Wang, Lixin Gao, J. Kurose, P. Shenoy, D. Towsley, S. Sen
{"title":"Periodic broadcast and patching services: implementation, measurement, and analysis in an internet streaming video testbed","authors":"M. K. Bradshaw, Bing Wang, Lixin Gao, J. Kurose, P. Shenoy, D. Towsley, S. Sen","doi":"10.1145/500141.500184","DOIUrl":"https://doi.org/10.1145/500141.500184","url":null,"abstract":"Multimedia streaming applications can consume a significant amount of server and network resources. Periodic broadcast and patching are two approaches that use multicast transmission and client buffering in innovative ways to reduce server and network load, while at the same time allowing asynchronous access to multimedia streams by a large number of clients. Current research in this area has focussed primarily on the algorithmic aspects of these approaches, with evaluation performed via analysis or simulation. In this paper, we describe the design and implementation of a flexible streaming video server and client testbed that implements both periodic broadcast and patching, and explore the issues that arise when implementing these algorithms. We present measurements detailing the overheads associated with the various server components (signaling, transmission schedule computation, data retrieval and transmission), the interactions between the various components of the architecture, and the overall end-to-end performance. We also discuss the importance of an appropriate server video segment caching policy. We conclude with a discussion of the insights gained from our implementation and experimental evaluation.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125727609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500227
Kun Tan, Richard Ribier, S. Liou
{"title":"Content-sensitive video streaming over low bitrate and lossy wireless network","authors":"Kun Tan, Richard Ribier, S. Liou","doi":"10.1145/500141.500227","DOIUrl":"https://doi.org/10.1145/500141.500227","url":null,"abstract":"","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128531267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500158
Zhong-Ming Su, S. Li, HongJiang Zhang
{"title":"Extraction of feature subspaces for content-based retrieval using relevance feedback","authors":"Zhong-Ming Su, S. Li, HongJiang Zhang","doi":"10.1145/500141.500158","DOIUrl":"https://doi.org/10.1145/500141.500158","url":null,"abstract":"In the past few years, relevance feedback (RF) has been used as an effective solution for content-based image retrieval (CBIR). Although effective, the RF-CBIR framework does not address the issue of feature extraction for dimension reduction and noise reduction. In this paper, we propose a novel method for extracting features for the class of images represented by the positive images provided by subjective RF. Principal Component Analysis (PCA) is used to reduce both noise contained in the original image features and dimensionality of feature spaces. The method increases the retrieval speed and reduces the memory significantly without sacrificing the retrieval accuracy.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130881413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500218
Michael G. Christel, Chang Huang
{"title":"SVG for navigating digital news video","authors":"Michael G. Christel, Chang Huang","doi":"10.1145/500141.500218","DOIUrl":"https://doi.org/10.1145/500141.500218","url":null,"abstract":"Scalable Vector Graphics (SVG) is a language for describing two-dimensional graphics in XML, specifically vector graphic shapes, images, and text. SVG is a new World Wide Web Consortium (W3C) Candidate Recommendation as of November 2000, and this paper describes how SVG provides an ideal framework for presenting manipulable, interactive summarizations into a multimedia information repository. Specifically, we present VIBE and map SVG interfaces into a digital news video library for delivery through web browsers. Pan-and-zoom visualizations of video through SVG are discussed.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130565743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}