{"title":"The MoCA Workbench: support for creativity in movie content analysis","authors":"R. Lienhart, S. Pfeiffer, W. Effelsberg","doi":"10.1109/MMCS.1996.534993","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534993","url":null,"abstract":"Semantic access to the content of a video is highly desirable for multimedia content retrieval. Automatic extraction of semantics requires content analysis algorithms. Our MoCA (Movie Content Analysis) project provides an interactive workbench supporting the researcher in the development of new movie content analysis algorithms. The workbench offers data management facilities for large amounts of video/audio data and derived parameters. It also provides an easy-to-use interface for a free combination of basic operators into more sophisticated operators. We can combine results from video track and audio track analysis. The MoCA Workbench shields the researcher from technical details and provides advanced visualization capabilities, allowing attention to focus on the development of new algorithms. The paper presents the design and implementation of the MoCA Workbench and reports practical experience.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116936495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated image and speech analysis for content-based video indexing","authors":"Yuh-Lin Chang, Wenjun Zeng, I. Kamel, Rafael Alonso","doi":"10.1109/MMCS.1996.534992","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534992","url":null,"abstract":"We study an important problem in multimedia database, namely the automatic extraction of indexing information from raw data based on video contents. The goal of our research project is to develop a prototype system for automatic indexing of sports videos. The novelty of our work is that we propose to integrate speech understanding and image analysis algorithms for extracting information. The main thrust of this work comes from the observation that in news or sports video indexing, usually speech analysis is more efficient in detecting events than image analysis. Therefore, in our system, the audio processing modules are first applied to locate candidates in the whole data. This information is passed to the video processing modules, which further analyze the video. The final products of video analysis are in the form of pointers to the locations of interesting events in a video. Our algorithms have been tested extensively with real TV programs, and results are presented and discussed.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124025525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis on disk scheduling for special user functions","authors":"K. Ng, A. Yeung","doi":"10.1109/MMCS.1996.535029","DOIUrl":"https://doi.org/10.1109/MMCS.1996.535029","url":null,"abstract":"Previous studies on disk scheduling for video services are usually based on computer simulation. We present an analysis of disk scheduling for video services. The purpose of the analysis is to obtain the maximum number of simultaneous video streams which can be supported by a disk using CLOOK or LOOK algorithm. The analysis is then extended to study disk scheduling for special user functions such as 'forward search' and 'reverse search'. Analysis on a technique called REdundancy for Special User Functions (RESUF) is also performed and the results show that the technique can keep the I/O demands to almost constant under all user request conditions.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117211576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting story units from long programs for video browsing and navigation","authors":"M. Yeung, B. Yeo, Bede Liu","doi":"10.1109/MMCS.1996.534991","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534991","url":null,"abstract":"Content-based browsing and navigation in digital video collections have been centered on sequential and linear presentation of images. To facilitate such applications, nonlinear and non-sequential access into video documents is essential, especially with long programs. For many programs, this can be achieved by identifying underlying story structures which are reflected both by visual content and temporal organization of composing elements. A new framework of video analysis and associated techniques are proposed to automatically parse long programs, to extract story structures and identify story units. The proposed analysis and representation contribute to the extraction of scenes and story units, each representing a distinct locale or event, that cannot be achieved by shot boundary detection alone. Analysis is performed on MPEG compressed video and without a priori models. The result is a compact representation that serves as a summary of the story and allows hierarchical organization of video documents.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127972439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a theory of collaborative multimedia","authors":"K. Candan, V. S. Subrahmanian, P. Rangan","doi":"10.1109/MMCS.1996.534987","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534987","url":null,"abstract":"We develop a theory of media objects, and present optimal algorithms for collaborative synthesis of media objects. We then extend the algorithms to incorporate quality constraints (such as image size) as well as distribution across multiple nodes. The theoretical model is validated by an experimental implementation that supports the theoretical results.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128017748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method to fuse two kinds of digital road maps","authors":"Y. Ohsawa, Masako Miyazaki","doi":"10.1109/MMCS.1996.535898","DOIUrl":"https://doi.org/10.1109/MMCS.1996.535898","url":null,"abstract":"The acquisition of digital road maps is essential for automobile navigation. Digital road maps are used in Japan by the Digital Road Map Association (DRMA) based on 1/25,000 map. The digital map, however, is unsuitable for town driving. In this situation a more detailed digital road map is requested by the users. To get a detailed digital road map from a paper map is a very expensive operation. We proposed a new method to obtain a more precise digital road map using a more detailed general purpose digital map which has already been published. However, using this digital map directly for auto-navigation is impossible. The method we propose in this paper obtains more detailed road network data from the 1/10,000 map and from the DRMA data.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic LoD for QoS management in the next generation VRML","authors":"M. Arikawa, Akira Amano, Kaori Maeda, R. Aibara, S. Shimojo, Yasuaki Nakamura, K. Hiraki, K. Nishimura, M. Terauchi","doi":"10.1109/MMCS.1996.534950","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534950","url":null,"abstract":"A high speed computer network will provide us with new broadband multimedia applications. This paper discusses new functions for the next generation VRML (Virtual Reality Modeling Language) over high speed computer networks. The LoD (Level of Detail) of 3D objects is the most important function for rendering scenes dynamically while managing the QoS (Quality of Service). New requirements for the next generation VRML are discussed. We present Differential VRML (DVRML) in order to update scene graphs dynamically, and describe principles of the LoD function based on the DVRML.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124943852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast algorithms for compositing multi-way compressed video","authors":"Z. Maojun, H. Xiaofeng, Yang Bing, Ku Xishu","doi":"10.1109/MMCS.1996.535019","DOIUrl":"https://doi.org/10.1109/MMCS.1996.535019","url":null,"abstract":"This paper presents fast algorithms for compositing two or more videos which have been compressed using the JPEG (Joint Photographic Experts Group) compression standard. Their performance is improved over traditional methods by processing compressed video data in the frequency domain, avoiding the DCT (discrete cosine transform) and IDCT (inverse DCT). Since the algorithms are built on a new theoretical basis, the computational speedup and memory usage efficiency are further improved over existing fast algorithms.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"2 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128690885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consumption-based buffer management for maximizing system throughput of a multimedia system","authors":"Kun-Lung Wu, Philip S. Yu","doi":"10.1109/MMCS.1996.534970","DOIUrl":"https://doi.org/10.1109/MMCS.1996.534970","url":null,"abstract":"In a multimedia server, multiple media streams are generally serviced in a cyclic fashion. Due to non-uniform playback rates and asynchronous arrivals of queries, there tends to be spare disk bandwidth in each service cycle. In this paper, we study the issue of dynamically utilizing the spare disk bandwidth and buffer to maximize the system throughput of a multimedia server. We introduce the concept of minimizing buffer consumption to select an appropriate media stream to use the spare disk bandwidth. The buffer consumption measures both the amount of buffer and the amount of time such a buffer is occupied (i.e. the space-time product). Different alternatives to utilizing spare disk bandwidth are examined, including rate-adjustable retrievals of an already-activated stream and prefetching the next waiting stream. Simulations are conducted to evaluate and compare different cases. The results show that (1) minimizing buffer consumption is a good criterion for maximizing the system throughput, and (2) in general, prefetching a waiting stream incurs more buffer consumption, and thus is less effective than rate-adjustable retrievals.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122477271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Failure recovery for multi-party real-time communication","authors":"Amit Gupta, K. Rothermel","doi":"10.1109/MMCS.1996.535023","DOIUrl":"https://doi.org/10.1109/MMCS.1996.535023","url":null,"abstract":"For real-time communication services to achieve widespread usage, it is important that the network services behave gracefully when any component fails. In this paper, we describe techniques and mechanisms for maintaining network services for multi-party real-time communication in the face of failures that may make parts of the network inaccessible. These protocols provide high performance during normal operations and the network performance degrades gracefully in face of network failures; e.g. in the presence of failures, the routes selected may not be optimal, the connection set-up may take a little more time, or resource allocation may be less efficient. With these mechanisms, the real-time communication protocols achieve robustness to single and/or multiple failures in the network, without diluting the strength of the performance guarantees offered or sacrificing the system performance in the common case.","PeriodicalId":371043,"journal":{"name":"Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116128066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}