MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319690
S. Mukhopadhyay, B. Smith
{"title":"Passive capture and structuring of lectures","authors":"S. Mukhopadhyay, B. Smith","doi":"10.1145/319463.319690","DOIUrl":"https://doi.org/10.1145/319463.319690","url":null,"abstract":"Despite recent advances in authoring systems and tools, creating multimedia presentations remains a labor-intensive process. This paper describes a system for automatically constructing structured multimedia documents from live presentations. The automatically produced documents contain synchronized and edited audio, video, images, and text. Two essential problems, synchronization of captured data and automatic editing, are identified and solved.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125474989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319601
D. Eager, M. Vernon, J. Zahorjan
{"title":"Optimal and efficient merging schedules for video-on-demand servers","authors":"D. Eager, M. Vernon, J. Zahorjan","doi":"10.1145/319463.319601","DOIUrl":"https://doi.org/10.1145/319463.319601","url":null,"abstract":"The simplest video-on-demand (VOD) delivery policy is to allocate a new media delivery stream to each client request when it arrives. This policy has the desirable properties of “immediate service” (there is minimal latency between the client request and the start of playback, assuming that sufficient server bandwidth is available to start the new stream), of placing minimal demands on client capabilities (the client receive bandwidth required is the media playback rate, and no client local storage is required), and of being simple to implement. However, the policy is untenable because it requires server bandwidth that scales linearly with the number of clients that must be supported simultaneously, which is too expensive for many applications.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115205423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319658
S. Srinivasan, D. Petkovic, D. Ponceleón
{"title":"Towards robust features for classifying audio in the CueVideo system","authors":"S. Srinivasan, D. Petkovic, D. Ponceleón","doi":"10.1145/319463.319658","DOIUrl":"https://doi.org/10.1145/319463.319658","url":null,"abstract":"The role of audio in the context of multimedia applications involving video is becoming increasingly important. Many efforts in this area focus on audio data that contains some built-in semantic information structure such as in broadcast news, or focus on classification of audio that contains a single type of sound such as cleaar speech or clear music only. In the CueVideo system, we detect and classify audio that consists of mixed audio, i.e. combinations of speech and music together with other types of background sounds. Segmentation of mixed audio has applications in detection of story boundaries in video, spoken document retrieval systems, audio retrieval systems etc. We modify and combine audio features known to be effective in distinguishing speech from music, and examine their behavior on mixed audio. Our preliminary experimental results show that we can achieve a classification accuracy of over 80% for such mixed audio. Our study also provides us with several helpful insights related to analyzing mixed audio in the context of real applications.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121749690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319621
H. Luo, A. Eleftheriadis
{"title":"Designing an interactive tool for video object segmentation and annotation","authors":"H. Luo, A. Eleftheriadis","doi":"10.1145/319463.319621","DOIUrl":"https://doi.org/10.1145/319463.319621","url":null,"abstract":"An interactive authoring system is proposed for semi-automatic video object segmentation and annotation. This system features a new contour interpolation algorithm, which enables the user to define the contour of a video object on multiple frames while the computer interpolates the missing contours of this object on every frame automatically. Typical active contour (snake) model is adapted and the contour interpolation problem is decomposed into two directional contour tracking problems and a merging problem. In addition, new user interaction models are created for the user to interact with the computer. Experiments indicate that this system offers a good balance between algorithm complexity and user interaction efficiency.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122394870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319678
S. Campos, B. Ribeiro-Neto, Autran Macêdo, Luciano Bertini
{"title":"Formal verification and analysis of multimedia systems","authors":"S. Campos, B. Ribeiro-Neto, Autran Macêdo, Luciano Bertini","doi":"10.1145/319463.319678","DOIUrl":"https://doi.org/10.1145/319463.319678","url":null,"abstract":"Multimedia systems such as video-on-demand (VOD) servers are time critical systems. These systems have strict response times, which implies that a delayed response can have serious consequence. For instance, in the case of a VOD server, an immediate consequence of a delayed response time can be user dissatisfaction, what can ultimately lead to the end of a business based on this system. Therefore, analysis and verification of timing properties of multimedia systems is an important problem. To verify if time critical systems satisfy their time bounds, we discuss the use of formal methods tools, in the verification and analysis of multimedia systems. We have used Verus (a formal verification tool) to model and analyze the ALMADEM-VOD server, a component of a true video-on-demand system. The modeling of this server in Verus has provided great insight into its design and its dynamic behavior. Using the quantitative estimates provided by Verus, we have determined performance bounds to the server. These bounds have pointed out that the performance curve of the actual server was almost at the predicted upper bound (worst case) level. These curves have uncovered design inefficiencies. After optimizing the server, its performance has improved over 40%, showing how useful formal verification can be used successfully during the design of multimedia systems.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127631619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319649
S. Goose, Carsten Möller
{"title":"A 3D audio only interactive Web browser: using spatialization to convey hypermedia document structure","authors":"S. Goose, Carsten Möller","doi":"10.1145/319463.319649","DOIUrl":"https://doi.org/10.1145/319463.319649","url":null,"abstract":"Interactive audio browsers provide both sighted and visually impaired users with access to the WWW. In addition to the desktop PC, audio browsing technology can be deployed that enable users to browse the WWW using a telephone or while driving a car. This paper describes a new conceptual model of the HTML document structure and its mapping to a 3D audio space. Novel features are discussed that provide information such as: an audio structural survey of the HTML document; accurate positional audio feedback of the source and destination anchors when traversing both inter-and intra-document links; a linguistic progress indicator; the announcement of destination document meta-information as new links are encountered. These new features can improve both the user's comprehension of the HTML document structure and their orientation within it. These factors, in turn, can improve the effectiveness of the browsing experience.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131374210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319642
Xiaoming Liu, Yueting Zhuang, Yunhe Pan
{"title":"Video based human animation technique","authors":"Xiaoming Liu, Yueting Zhuang, Yunhe Pan","doi":"10.1145/319463.319642","DOIUrl":"https://doi.org/10.1145/319463.319642","url":null,"abstract":"Human animation is a challenging domain in computer animation. To aim at many shortcomings in conventional techniques, this paper proposes a new video based human animation technique. Given a clip of video, firstly human joints are tracked with the support of Kalman filter and morph-block based match in the image sequence. Then corresponding sequence of three-dimension (3D) human motion skeleton is constructed under the perspective projection using camera calibration and human anatomy knowledge. Finally a motion library is established automatically by annotating multiform motion attributes, which can be browsed and queried by the animator. This approach has the characteristic of rich source material, low computing cost, efficient production, and realistic animation result. We demonstrate it on several video clips of people doing full body movements, and visualize the results by re-animating a 3D human skeleton model.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132921007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319654
Shingo Uchihashi, J. Foote, Andreas Girgensohn, J. Boreczky
{"title":"Video Manga: generating semantically meaningful video summaries","authors":"Shingo Uchihashi, J. Foote, Andreas Girgensohn, J. Boreczky","doi":"10.1145/319463.319654","DOIUrl":"https://doi.org/10.1145/319463.319654","url":null,"abstract":"This paper presents methods for automatically creating pictorial video summaries that resemble comic books. The relative importance of video segments is computed from their length and novelty. Image and audio analysis is used to automatically detect and emphasize meaningful events. Based on this importance measure, we choose relevant keyframes. Selected keyframes are sized by importance, and then efficiently packed into a pictorial summary. We present a quantitative measure of how well a summary captures the salient events in a video, and show how it can be used to improve our summaries. The result is a compact and visually pleasing summary that captures semantically important events, and is suitable for printing or Web access. Such a summary can be further enhanced by including text captions derived from OCR or other methods. We describe how the automatically generated summaries are used to simplify access to a large collection of videos.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133908318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319627
W. Zeng, S. Lei
{"title":"Efficient frequency domain video scrambling for content access control","authors":"W. Zeng, S. Lei","doi":"10.1145/319463.319627","DOIUrl":"https://doi.org/10.1145/319463.319627","url":null,"abstract":"Multimedia data security is very important for multimedia commerce on the Internet such as video-on-demand and real-time video multicast. Traditional cryptographic algorithms for data security are often not fast enough to process the vast amount of data generated by the multimedia applications to meet the real-time constraints. This paper presents a joint encryption and compression framework in which video data are scrambled efficiently in the frequency domain by employing selective bit scrambling, block shuffling and block rotation of the transform coefficients and motion vectors. The new approach is very simple to implement, yet provides considerable level of security, has minimum adverse impact on the compression efficiency, and allows transparency, transcodability, and other content processing functionalities without accessing the cryptographic key.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129539674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '99 | Pub Date: 1999-10-30 | DOI: 10.1145/319463.319680
J. Giménez, V. Messerli, O. Figueiredo, B. Gennart, R. Hersch
{"title":"Computer-aided parallelization of continuous media applications: the 4D beating heart slice server","authors":"J. Giménez, V. Messerli, O. Figueiredo, B. Gennart, R. Hersch","doi":"10.1145/319463.319680","DOIUrl":"https://doi.org/10.1145/319463.319680","url":null,"abstract":"Parallel servers for I/O and compute intensive continuous media applications are difficult to develop. A server application comprises many threads located in different address spaces as well as files striped over multiple disks located on different computers. The present contribution describes the construction of a continuous media server, the 4D beating heart slice server, based on a computer-aided parallelization tool (CAP) and on a library of parallel file system components enabling the combination of pipelined parallel disk access and processing operations. Thanks to CAP, the presented architecture is concisely described as a set of threads, operations located within the threads and flow of data and parameters (tokens) between operations. Continuous media applications are supported by allowing tokens to be suspended during a period of time specified by a user-defined function. Our target application, the 4D beating heart server supports the extraction of freely oriented slices from a 4D beating heart volume (one 3D volume per time sample). This server application requires both a high I/O throughput for accessing from disks the set of 4D sub-volumes (extents) intersecting the desired slices and a large amount of processing power to extract these slices and to resample them into the display grid. With a server configuration of 3 PCs and 24 disks, up to 7.3 slices can be delivered per second, i.e. 43 MB/s are continuously read from disks and 4.1 MB/s of slice parts are extracted, transfered to the client, merged, buffered and displayed. This performance is close to the maximal performance deliverable by the underlying hardware. The observed single stream server delay jitter varies between 0.6s (52% of maximal display rate) and 1.4s (92% of the maximal display rate). For the same resource utilization, the jitter is proportional to the number of streams that are accessed synchronously.\u0000 The presented 4D beating heart application suggests that powerful continuous media server applications can be built on top of a set of simple PCs connected to SCSI disks.","PeriodicalId":265329,"journal":{"name":"MULTIMEDIA '99","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133610046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}