MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500256
Bruno Emond, Martin Brooks, Arnold G. Smith
{"title":"A broadband web-based application for video sharing and annotation","authors":"Bruno Emond, Martin Brooks, Arnold G. Smith","doi":"10.1145/500141.500256","DOIUrl":"https://doi.org/10.1145/500141.500256","url":null,"abstract":"This demonstration reports on the current stage of development of a web-based environment to support video sharing and annotation. The initial requirements and their implementation are briefly presented. The application is aimed at supporting the professional development of teachers with a multimedia application over a broadband network. However, work with teachers has pointed us to new requirements, differentiating VSA from prrevious work. Continued work with teachers will lead to further evolution of VSA.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125345922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500215
R. S. Aygün, A. Zhang
{"title":"Middle-tier for multimedia synchronization","authors":"R. S. Aygün, A. Zhang","doi":"10.1145/500141.500215","DOIUrl":"https://doi.org/10.1145/500141.500215","url":null,"abstract":"The gap between the synchronization specification and the synchronization model limits user interactions for a multimedia presentation. The middle-tier for multimedia synchronization handles the synchronization rules that are directly extracted from the specification. In addition to these rules, the middle-tier also manages implicit synchronization rules which are not specified but can be extracted from other rules. The synchronization rules generated by the middle-tier assists the synchronization model to provide user interactions while keeping the synchronization specification minimal. We give examples of how these rules are generated from SMIL expressions.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126444923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500253
Jiang Li, Keman Yu, Gang Chen, Yong Wang, Hanning Zhou, Jizheng Xu, K. Ng, Kaibo Wang, Lijie Wang, H. Shum
{"title":"Portrait video phone","authors":"Jiang Li, Keman Yu, Gang Chen, Yong Wang, Hanning Zhou, Jizheng Xu, K. Ng, Kaibo Wang, Lijie Wang, H. Shum","doi":"10.1145/500141.500253","DOIUrl":"https://doi.org/10.1145/500141.500253","url":null,"abstract":"As the Internet and wirless networks are developed rapidly, the demand of communicating anywhere, anytime on any device emerges. However, most of the current wireless networks still work in low bandwidths, and mobile devices still suffer from weak computational power, short battery lifetime and limited display capability. We developed portrait video phone systems that can run on Pcs and Pocket Pcs at very low bit rates through the Internet. The core technology that portrait video phones employ is the so-called portrait video (or bi-level video) codec. Portrait video codec first converts a full-color video into a black/white image sequence and then compresses it into a black/white portrait-like video. Portrait video processes clearer shape, smoother motion, shorter initial latency, and cheaper computational cost than MPEG2, MPEG4 and H.263 for low bandwidths. Typically the portrait video phone provides QCIF-size video with a frame rate of 5-15 fps for a 9.6 Kbps video bandwidth. The portrait video is so small that it can even be transmitted through an HTTP proxy as text. Experiments show that the portrait video phones work well on ordinary GSM wireless telecommunication networks.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125885380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ubiquitous media agents for managing personal multimedia files","authors":"Wenyin Liu, Zheng Chen, Fan Lin, Rui Yang, Mingjing Li, HongJiang Zhang","doi":"10.1145/500141.500229","DOIUrl":"https://doi.org/10.1145/500141.500229","url":null,"abstract":"A novel idea of ubiquitous media agents is presented. Media agents are intelligent systems that are able to automatically collect and build personalized semantic indices of multimedia data on behalf of the user whenever and wherever he/she accesses/uses these multimedia data. The sources of these semantic descriptions are the textual context of the same documents that contain these multimedia data. The URLs of these multimedia data are indexed using these textual features. When the user wants to use these multimedia data once again, the media agents can also help the user find relevant multimedia data and provide proper suggestions based on the semantic indices. The media agents can also learn form the user's interaction records to refine the semantic indices and to model the user intentions and preferences. In our experiments, the media agents are effective in gathering relevant semantics for media objects and learning to provide precise suggestions when the user wants to re-use relevant media objects again.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128236145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500193
Andrew T. Wilson, Ketan Mayer-Patel, Dinesh Manocha
{"title":"Spatially-encoded far-field representations for interactive walkthroughs","authors":"Andrew T. Wilson, Ketan Mayer-Patel, Dinesh Manocha","doi":"10.1145/500141.500193","DOIUrl":"https://doi.org/10.1145/500141.500193","url":null,"abstract":"We introduce the notion of spatially encoded video and use it for efficiently representing image-based impostors for interactive walkthroughs. As part of a pre-process, we automatically decompose the model and compute the far-fields. The resulting texture images are organized along multiple dimensions and can be accessed in a user-steered order at interactive rates. Our encoding algorithm can compress the impostors size by two orders of magnitude. Furthermore, the storage cost for additional impostors or samples grows sub-linearly. The resulting system has been applied to a complex CAD environment composed of 13 million triangles. We are able to render it at interactive rates on a PC with little loss in image quality.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132790598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500153
Desmond Chambers, G. Lyons, J. Duggan
{"title":"Stream enhancements for the CORBA event service","authors":"Desmond Chambers, G. Lyons, J. Duggan","doi":"10.1145/500141.500153","DOIUrl":"https://doi.org/10.1145/500141.500153","url":null,"abstract":"This paper describes a number of enhancements for the standard CORBA Event Service. The basic service definition has been extended to support stream events, multimedia data flows, event fragmentation, quality of service definition, as well as multicast event delivery. The paper evaluates the service performance and describes experiences using the enhanced service in the development of a test application.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133933869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500180
F. Nack, W. Putz
{"title":"Designing annotation before it's needed","authors":"F. Nack, W. Putz","doi":"10.1145/500141.500180","DOIUrl":"https://doi.org/10.1145/500141.500180","url":null,"abstract":"This paper considers the automated and semi-automated annotation of audiovisual media in a new type of production framework, A4SM (Authoring System for Syntactic, Semantic and Semiotic Modelling). We present the architecture of the framework and outline the underlying XML-Schema based content description structures of A4SM. We then describe tools for a news and demonstrate how video material can be annotated in real time and how this information can not only be used for retrieval but also can be used during the different phases of the production process itself. Finally, we discuss the pros and cons of our approach of evolving semantic networks as the basis for audio- visual content description.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"15 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114006926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500236
N. Sgouros, S. Kousidou
{"title":"Authoring and execution environments for multimedia applications featuring robotic actors","authors":"N. Sgouros, S. Kousidou","doi":"10.1145/500141.500236","DOIUrl":"https://doi.org/10.1145/500141.500236","url":null,"abstract":"Recent advances in multimedia systems and robotics encourage the development of novel types of applications that associate the use of various multimedia objects with the behavior of multiple robotic actors. Unfortunately, the development of such systems is hampered by the lack of appropriate authoring and execution environments. This paper seeks to fill this gap by describing CHOROS, a Java-based authoring environment for visually planning the behavior of multiple robotic actors and linking it with the rendering state of various multimedia objects. This is accomplished with an augmented reality interface in which the author draws the robot paths and associates them with timelines describing the use of various multimedia objects. During the actual execution of the application the environment automatically tracks and adjusts the behavior of the robotic actors in order to maintain its association with the rendering state of the multimedia objects.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122641198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500222
P. Hong, Zhen Wen, Thomas S. Huang
{"title":"An integrated framework for face modeling, facial motion analysis and synthesis","authors":"P. Hong, Zhen Wen, Thomas S. Huang","doi":"10.1145/500141.500222","DOIUrl":"https://doi.org/10.1145/500141.500222","url":null,"abstract":"This paper presents an integrated framework for face modeling, facial motion analysis and synthesis. This framework systematically addresses three closely related research issues: (1) selecting a quantitative visual representation for face modeling and face animation; (2) automatic facial motion analysis based on the same visual representation; and (3) speech to facial coarticulation modeling. The framework provides a guideline for methodically building a face modeling and animation system. The systematicness of the framework is reflected by the links among its components, whose details are presented. Based on this framework, we improved a face modeling and animation system, called the iFACE system [4]. The final system provides functionalities for customizing a generic face model for an individual, text driven face animation, off-line speech driven face animation, and real-time speech driven face animation.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124162101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500231
Simon Moncrieff, C. Dorai, S. Venkatesh
{"title":"Affect computing in film through sound energy dynamics","authors":"Simon Moncrieff, C. Dorai, S. Venkatesh","doi":"10.1145/500141.500231","DOIUrl":"https://doi.org/10.1145/500141.500231","url":null,"abstract":"We develop an algorithm for the detection and classification of affective sound events underscored by specific patterns of sound energy dynamics. We relate the portrayal of these events to proposed high level affect or emotional coloring of the events. In this paper, four possible characteristic sound energy events are identified that convey well established meanings through their dynamics to portray and deliver certain affect, sentiment related to the horror film genre. Our algorithm is developed with the ultimate aim of automatically structuring sections of films that contain distinct shades of emotion related to horror themes for nonlinear media access and navigation. An average of 82% of the energy events, obtained from the analysis of the audio tracks of sections of four sample films corresponded correctly to the proposed affect. While the discrimination between certain sound energy event types was low, the algotithm correctly detected 71% of the occurrences of the sound energy events within audio tracks of the films analyzed, and thus forms a useful basis for determining affective scenes characteristic of horror in movies.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128590346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}