MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500188
Anthony G. Nguyen, Jenq-Neng Hwang
{"title":"Scene context dependent rate control","authors":"Anthony G. Nguyen, Jenq-Neng Hwang","doi":"10.1145/500141.500188","DOIUrl":"https://doi.org/10.1145/500141.500188","url":null,"abstract":"To prevent the loss of the information embedded in the generated variable bit rate data that is transmitted over a constant bit rate channel, several methods were proposed in MPEG TMN5, H.263 TMN5, and TMN8. In these methods, the quantity of coded data is controlled by adjusting the coding parameters according to the amount of data remaining in the buffer. Because this control is based on the past coded information and does not reflect the nature of the image being coded, there is no assurance that sufficient image quality will be obtained. In this paper, we present a Scene Context Dependent coding scheme to control the generated variable bit rate data over a constant bit rate channel for non real-time video applications and show the improvements over the encoding using the TMN H263 codec. This can be considered as a method to control the bit rate in accordance with the characteristics of human visual perception using the combination of feedforward control, feed-backward control, and model-based approaches.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130931173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500226
Khanh Vu, K. Hua, Jung-Hwan Oh
{"title":"Indexing for efficient processing of noise-free queries","authors":"Khanh Vu, K. Hua, Jung-Hwan Oh","doi":"10.1145/500141.500226","DOIUrl":"https://doi.org/10.1145/500141.500226","url":null,"abstract":"A typical query image contains not only relevant objects, but also irrelevant image areas. The latter, referred to as noise, has limited the effectiveness of existing image retrieval systems. In this paper, we propose a technique that allows users to define arbitrary-shaped queries out of example images. We present a new similarity model, and introduce an indexing technique for this new environment. Our query model is more expressive than the standard query-by-example. The user can draw a contour around a number of objects to specify spatial (relative distance) and scaling (relative size) constraints among them, or use separate contours to disassociate these objects. Our experimental results confirm that traditional approaches, such as Local Color Histogram and Correlogram, suffer from noisy queries. In contrast, our method can leverage arbitrary-shaped queries to offer significantly better performance. This is achieved using only a fraction of the storage overhead required by the other two techniques.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125965512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500149
M. Slaney, D. Ponceleón, James Kaufman
{"title":"Multimedia edges: finding hierarchy in all dimensions","authors":"M. Slaney, D. Ponceleón, James Kaufman","doi":"10.1145/500141.500149","DOIUrl":"https://doi.org/10.1145/500141.500149","url":null,"abstract":"This paper describes a new unified representation for the information in a video. We reduce the dimensionality of the signal with either a singular-value decomposition (on the semantic and image data) or mel-frequency cepstral coefficients (on the audio data) and then concatenate the vectors to form a multi-dimensional representation of the video. Using scale-space techniques we find large jumps in the video's path, which we call edges. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis creates a hierarchical segmentation of the video, or a table-of-contents, from the audio, semantic and image data.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"12 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116127722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500181
S. Nepal, Uma Srinivasan, G. Reynolds
{"title":"Automatic detection of 'Goal' segments in basketball videos","authors":"S. Nepal, Uma Srinivasan, G. Reynolds","doi":"10.1145/500141.500181","DOIUrl":"https://doi.org/10.1145/500141.500181","url":null,"abstract":"Advances in the media and entertainment industries, for example streaming audio and digital TV, present new challenges for managing large audio-visual collections. Efficient and effective retrieval from large content collections forms an important component of the business models for content holders and this is driving a need for research in audio-visual search and retrieval. Current content management systems support retrieval using low-level features, such as motion, colour, texture, beat and loudness. However, low-level features often have little meaning for the human users of these systems, who much prefer to identify content using high-level semantic descriptions or concepts. This creates a gap between the system and the user that must be bridged for these systems to be used effectively. The research presented in this paper describes our approach to bridging this gap in a specific content domain, sports video. Our approach is based on a number of automatic techniques for feature detection used in combination with heuristic rules determined through manual observations of sports footage. This has led to a set of models for interesting sporting events (goal segments) that have been implemented as part of an information retrieval system. The paper also presents results comparing output of the system against manually identified goals.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132520446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500230
K. Malan, G. Marsden, E. Blake
{"title":"Visual query tools for uncertain spatio-temporal data","authors":"K. Malan, G. Marsden, E. Blake","doi":"10.1145/500141.500230","DOIUrl":"https://doi.org/10.1145/500141.500230","url":null,"abstract":"Some multimedia archives contain data which have vague locations in time and space. By this we mean that, although there is some idea of when and where the entity is located, the precise information is unknown. In this paper, we present a novel approach to displaying and querying such uncertain data. We use the concepts of dynamic queries, add to this a 2D query tool for performing spatial queries and enable Boolean combinations of queries. We have implemented these ideas in a pilot system for querying African artwork. In this way, we show how it is possible for novice users to easily query large multimedia archives with complex uncertain attributes.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125623108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500235
Qiang Zhu, Jiashi Chen
{"title":"A new approach for rotated face detection","authors":"Qiang Zhu, Jiashi Chen","doi":"10.1145/500141.500235","DOIUrl":"https://doi.org/10.1145/500141.500235","url":null,"abstract":"Human face detection has always been an important problem for face recognition and face tracking. Though considerable attempts have been made to detect and localize faces, these approaches have made assumptions that restrict their extension to more general cases. In this paper we design a novel method to detect a rotated face on the basis of image edge information. To address efficiency, we propose two key techniques in our approach: first, a three-point-based RHT is applied to detect an ellipse; second, we speed up the template matching procedure by calculating the orientation histogram distribution and locating the symmetry axis of a rotated face. Finally, we validate our idea with several experimental results.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128070725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500234
Stephan Olbrich, H. Pralle
{"title":"A tele-immersive, virtual laboratory approach based on real-time streaming of 3D scene sequences","authors":"Stephan Olbrich, H. Pralle","doi":"10.1145/500141.500234","DOIUrl":"https://doi.org/10.1145/500141.500234","url":null,"abstract":"In this paper we describe a distributed system approach for three-dimensional exploration in the context of high-performance computing, which supports postprocessing, online-visualization and interactive steering scenarios. Our processing chain consists of three instances (data source, streaming server, viewer), which can be distributed in high-performance networks and operated either in a full pipeline (on-the-fly 3D visualization, computational steering) or in asynchronously running pairs (visualization of prepared 3D scenes). It takes advantage of parallel data extraction, efficient 3D representation, and streaming protocols over TCP/IP.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124562550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500270
J. Jang, Hong-Ru Lee, Jiang-Chun Chen
{"title":"Super MBox: an efficient/effective content-based music retrieval system","authors":"J. Jang, Hong-Ru Lee, Jiang-Chun Chen","doi":"10.1145/500141.500270","DOIUrl":"https://doi.org/10.1145/500141.500270","url":null,"abstract":"This demo presents an implementation of a content-based music retrieval system that can take a user's acoustic input (8-second clip of singing or humming) via a microphone and then retrieve the intended song from a database containing 13,000 candidate songs. The system, known as Super MBox, demonstrates the feasibility of real-time music retrieval with a high recognition rate, which can be used for music search engines over the Internet and/or query engines in digital music libraries or karaoke machines.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122585730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500268
Richard Han, Ching-Yung Lin, John R. Smith, Belle L. Tseng, Vida Ha
{"title":"CPU/power-constrained mobile devices","authors":"Richard Han, Ching-Yung Lin, John R. Smith, Belle L. Tseng, Vida Ha","doi":"10.1145/500141.500268","DOIUrl":"https://doi.org/10.1145/500141.500268","url":null,"abstract":"Due to the limited processing capability, memory constraints, and the power budget of mobile clients, multimedia coders and/or decoders are often difficult to implement on wireless handheld PDAs. In this Universal Tuner project, we designed and implemented a wireless video streaming system that transcodes MPEG-1/2 videos or live TV broadcasting videos to the BW or indexed color Palm OS devices. In our system, the complexity of multimedia compression and decompression algorithms is adaptively partitioned between the encoder and decoder. A mobile client would selectively disable or reenable stages of the algorithm to adapt to the device's effective processing capability. Our variable-complexity strategy of selective disabling of modules supports graceful degradation of the complexity of multimedia coding and decoding into a mobile client's low-power mode, i.e., when the clock frequency of its next-generation low-power CPU has been scaled down to conserve power. We modified the structure of the standard motion-compensated DCT video codecs to implement a simplified encoder on a PC server and the decoder on a complexity-constrained PDA viewing client.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129145281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 Pub Date: 2001-10-01 DOI: 10.1145/500141.500166
Wei Tsang Ooi, R. V. Renesse
{"title":"Distributing media transformation over multiple media gateways","authors":"Wei Tsang Ooi, R. V. Renesse","doi":"10.1145/500141.500166","DOIUrl":"https://doi.org/10.1145/500141.500166","url":null,"abstract":"Media gateways have been proposed as a solution to the network heterogeneity problem in media multicasting. Services on the gateways transform media streams as they flow through the gateways. In this paper, we present our work on composable services in media gateways. A user can request a computation to be performed on a set of media streams. The system then distributes the computation over multiple gateways for execution. We present an algorithm for decomposing the computation into sub-computations, and an application-level protocol that locates appropriate media gateways to run these sub-computations.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128955664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}