MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500212
C. Krasic, J. Walpole
{"title":"Priority-progress streaming for quality-adaptive multimedia","authors":"C. Krasic, J. Walpole","doi":"10.1145/500141.500212","DOIUrl":"https://doi.org/10.1145/500141.500212","url":null,"abstract":"The Internet's ubiquity amply motivates us to harness it for video distribution, however, its best-effort service model is in direct conflict with video's inherent timeliness requirements. Today, the Internet is unrivaled in its rich composition, consisting of an unparalleled assortment of networks and hosts. This richness is the result of an architecture that emphasizes interoperability over predictable performance. From the lowest levels, the Internet architecture prefers the best effort service model. We feel current solutions for media-streaming have yet to adequately address this conflict between timeliness and best-effort service.We propose that streaming-media solutions targetted at the Internet must fully embrace the notion of graceful degradation, they must be architected with the expectation that they operate within a continuum of service levels, adjusting quality-resource trade-offs as necessary to achieve timeliness requirements. In the context of the Internet, the continuum of service levels spans across a number oftime scales, ranging from sub-second timescales to timescales as long as months and years. We say sub-second timescales in relation to short-term dynamics such as network traffic and host workloads, while timescales of months and years relate to the continuous deployment of improving network, compute and storage infrastructure.We support our thesis with a proposal for a streaming model which we claim is simple enough to use end-to-end, yet expressive enough to tame the conflict between real-time and best-effort personalities of Internet streaming. The model is called Priority-Progress streaming. In this proposal, we will describe the main features of Priority-Progress streaming, which we have been implemented in a software-based streaming video system, called the Quasar pipeline.Our work is primarily concerned with the class of streaming applications. To prevent confusion, we now clarify the important distinction between streaming and other forms of distribution, namely download. For a video, we assume download is defined so that the transfer of the video must complete before the video is viewed. Transfer and viewing are temporally sequential. With this definition, it is a simple matter to employ Quality-adaptive video. One algorithm would be to deliver the entire video in the order from low to high quality components. The user may terminate the download early, and the incomplete video will automatically have as high quality as was possible. Thus, Quality-adaptive download can be implemented in an entirely best-effort, time-insensitive, fashion. On the other hand, we assume streaming means that the user views the video at the same time that the transfer occurs. Transfer and viewing are concurrent. 
There are timeliness requirements inherent in this definition, which can only be reconciled with best-effort delivery through a time-sensitive adaptive approach.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130972696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
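The download example in the abstract maps onto a simple ordering rule: send components from the lowest quality layer upward, so that stopping early still leaves the best video the received bytes can support. The Python sketch below illustrates that rule under assumed names (Component, quality_adaptive_order, download); it is an illustration of the idea, not the Quasar pipeline.

```python
# A minimal sketch of quality-adaptive download ordering: components are sent
# from lowest to highest quality layer, so an early termination still leaves
# the best video achievable for the bytes received. Names and data layout are
# illustrative assumptions, not the paper's implementation.

from dataclasses import dataclass

@dataclass
class Component:
    frame: int        # which frame this component refines
    layer: int        # 0 = base quality, higher = enhancement
    data: bytes

def quality_adaptive_order(components):
    """Order components so all lower-quality layers precede higher ones."""
    return sorted(components, key=lambda c: (c.layer, c.frame))

def download(components, byte_budget):
    """Simulate a download the user may cut short after byte_budget bytes."""
    received, used = [], 0
    for c in quality_adaptive_order(components):
        if used + len(c.data) > byte_budget:
            break                      # user terminated the download early
        received.append(c)
        used += len(c.data)
    # Every frame keeps the highest layer that was fully received.
    best = {}
    for c in received:
        best[c.frame] = max(best.get(c.frame, -1), c.layer)
    return best
```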
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500262
J. Khan, S. S. Yang, Qiong Gu, D. Patel, Patrick Mail, Oleg V. Komogortsev, Wansik Oh, Z. Guo
{"title":"Resource adaptive netcentric systems: a case study with SONET - a self-organizing network embedded transcoder","authors":"J. Khan, S. S. Yang, Qiong Gu, D. Patel, Patrick Mail, Oleg V. Komogortsev, Wansik Oh, Z. Guo","doi":"10.1145/500141.500262","DOIUrl":"https://doi.org/10.1145/500141.500262","url":null,"abstract":"In this paper we discuss architecture for network aware adaptive systems for next generation networks. We present in the context of a novel cognizant video transcoding system, which is capable of negotiating local network state based rate and let the video propagate over extreme network with highly asymmetric link and node capacities utilizing knowlege about the network, content protocol and the content itself.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133197063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500233
Mohamad Obeid, B. Jedynak, M. Daoudi
{"title":"Image indexing & retrieval using intermediate features","authors":"Mohamad Obeid, B. Jedynak, M. Daoudi","doi":"10.1145/500141.500233","DOIUrl":"https://doi.org/10.1145/500141.500233","url":null,"abstract":"Visual information retrieval systems use low-level features such as color, texture and shape for image queries. Users usually have a more abstract notion of what will satisfy them. Using low-level features to correspond to high-level abstractions is one aspect of the semantic gap.In this paper, we introduce intermediate features. These are low-level \"semantic features\" and \"high level image\" features. That is, in one hand, they can be arranged to produce high level concept and in another hand, they can be learned from a small annotated database. These features can then be used in an image retrieval system.We report experiments where intermediate features are textures. These are learned from a small annotated database. The resulting indexing procedure is then demonstrated to be superior to a standard color histrogram indexing.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132595685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500242
Michel D. Bondy, N. Georganas, E. Petriu, D. Petriu, M. Cordea, T. Whalen
{"title":"Model-based face and lip animation for interactive virtual reality applications","authors":"Michel D. Bondy, N. Georganas, E. Petriu, D. Petriu, M. Cordea, T. Whalen","doi":"10.1145/500141.500242","DOIUrl":"https://doi.org/10.1145/500141.500242","url":null,"abstract":"In this paper, we describe an experimental performance-driven animation system for an avatar face using model-based video coding and audio-track driven lip animation.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127405072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500159
Simon Tong, E. Chang
{"title":"Support vector machine active learning for image retrieval","authors":"Simon Tong, E. Chang","doi":"10.1145/500141.500159","DOIUrl":"https://doi.org/10.1145/500141.500159","url":null,"abstract":"Relevance feedback is often a critical component when designing image databases. With these databases it is difficult to specify queries directly and explicitly. Relevance feedback interactively determinines a user's desired output or query concept by asking the user whether certain proposed images are relevant or not. For a relevance feedback algorithm to be effective, it must grasp a user's query concept accurately and quickly, while also only asking the user to label a small number of images. We propose the use of a support vector machine active learning algorithm for conducting effective relevance feedback for image retrieval. The algorithm selects the most informative images to query a user and quickly learns a boundary that separates the images that satisfy the user's query concept from the rest of the dataset. Experimental results show that our algorithm achieves significantly higher search accuracy than traditional query refinement schemes after just three to four rounds of relevance feedback.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"164 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127506985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500171
S. Pfeiffer
{"title":"Pause concepts for audio segmentation at different semantic levels","authors":"S. Pfeiffer","doi":"10.1145/500141.500171","DOIUrl":"https://doi.org/10.1145/500141.500171","url":null,"abstract":"This paper presents work on the determination of temporal audio segmentations at different semantic levels. The segmentation algorithm draws upon the calculation of relative silences or pauses. A perceptual loudness measure is the only feature employed. An adaptive threshold is used for classification into pause and non-pause. The segmentation algorithm that determines perceptually relevant pause intervals for different semantic levels incorporates a minimum duration and a maximum interruption constraint. The influence of the different parameters on the segmentation is examined in experiments and presented in this paper. A new approach for evaluating segmentation accuracies is required. It is shown that the simple perceptual pause concept has a very high relevance when segmenting audio at different semantic levels.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"163 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127518477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500204
Jiann-Min Ho, Jia-Cheng Hu, P. Steenkiste
{"title":"A conference gateway supporting interoperability between SIP and H.323","authors":"Jiann-Min Ho, Jia-Cheng Hu, P. Steenkiste","doi":"10.1145/500141.500204","DOIUrl":"https://doi.org/10.1145/500141.500204","url":null,"abstract":"Increased network bandwidth is making desktop video conferencing an attractive application for an increasing number of computer users. Unfortunately, two competing standards for video conferencing signaling are in use, H.323 and SIP. In this paper we look at the interoperability between these two standards by developing a conferencing gateway that supports conferences involving both SIP and H.323 clients. By appropriately translating between H.323 and SIP operations, our prototype gateway supports basic multi-party video conferencing between NetMeeting (an H.323 client) and VIC (a SIP client) without modifications to the clients. However, our experiments also show that seamless interoperation would require changes to the client implementations and the standards.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114961276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500173
Lie Lu, Hao Jiang, HongJiang Zhang
{"title":"A robust audio classification and segmentation method","authors":"Lie Lu, Hao Jiang, HongJiang Zhang","doi":"10.1145/500141.500173","DOIUrl":"https://doi.org/10.1145/500141.500173","url":null,"abstract":"In this paper, we present a robust algorithm for audio classification that is capable of segmenting and classifying an audio stream into speech, music, environment sound and silence. Audio classification is processed in two steps, which makes it suitable for different applications. The first step of the classification is speech and non-speech discrimination. In this step, a novel algorithm based on KNN and LSP VQ is presented. The second step further divides non-speech class into music, environment sounds and silence with a rule based classification scheme. Some new features such as the noise frame ratio and band periodicity are introduced and discussed in detail. Our experiments in the context of video structure parsing have shown the algorithms produce very satisfactory results.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128744341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MULTIMEDIA '01 | Pub Date: 2001-10-01 | DOI: 10.1145/500141.500252
M. V. Setten, E. Oltmans
{"title":"Demonstration of a distributed MPEG-7 video search and retrieval application in the educational domain","authors":"M. V. Setten, E. Oltmans","doi":"10.1145/500141.500252","DOIUrl":"https://doi.org/10.1145/500141.500252","url":null,"abstract":"This demonstration shows the end-user application of the Video-over-IP (VIP) system. This system encompasses a whole chain of processes for digital video databases, ranging from distributed content production to acquire MPEG-7 metadata (including speech-and video analysis) to the deployment and access to the video database by end users. This system has been developed as an application for the next generation Internet. The end-user application provides distributed search engines, tools to browse and analyze videos, and playlist functionalities. The tools have been developed using a user-centered design approach, to assure usability for the students and teachers in the pilot project.","PeriodicalId":416848,"journal":{"name":"MULTIMEDIA '01","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130591125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}