2005 IEEE International Conference on Multimedia and Expo: Latest Publications

Speech-Based Visual Concept Learning Using Wordnet
2005 IEEE International Conference on Multimedia and Expo | Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521627
Xiaodan Song, Ching-Yung Lin, Ming-Ting Sun
Abstract: Modeling visual concepts using supervised or unsupervised machine learning approaches is becoming increasingly important for video semantic indexing, retrieval, and filtering applications. Naturally, videos include multimodal data such as audio, speech, visual, and text, which are combined to infer the overall semantic concepts. In the literature, however, most research has been conducted within only a single domain. In this paper, we propose an unsupervised technique that builds context-independent keyword lists for desired visual concept modeling using WordNet. Furthermore, we propose an extended speech-based visual concept (ESVC) model that reorders and extends these keyword lists by supervised learning based on multimodal annotation. Experimental results show that the context-independent models achieve performance comparable to conventional supervised learning algorithms, and that the ESVC model achieves about 53% and 28.4% improvement on two testing subsets of the TRECVID 2003 corpus over a state-of-the-art speech-based video concept detection algorithm.
Citations: 2
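The keyword scoring that underlies a WordNet-based keyword list can be sketched in miniature with a toy hypernym graph and WordNet-style path similarity. Everything below (the graph, the words, the threshold) is invented for illustration and is not taken from the paper:

```python
# Toy hypernym graph (child -> parent); a tiny stand-in for WordNet's noun hierarchy.
HYPERNYMS = {
    "car": "vehicle", "truck": "vehicle", "boat": "vehicle",
    "vehicle": "artifact", "artifact": "entity",
    "dog": "animal", "cat": "animal", "animal": "entity",
}

def path_to_root(word):
    """Return the hypernym chain from a word up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def path_similarity(a, b):
    """WordNet-style path similarity: 1 / (shortest path length + 1)."""
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    if not common:
        return 0.0
    dist = min(pa.index(c) + pb.index(c) for c in common)
    return 1.0 / (dist + 1)

def keyword_list(concept, candidates, threshold=0.3):
    """Keep candidates whose similarity to the concept clears the threshold."""
    scored = [(w, path_similarity(concept, w)) for w in candidates]
    return [w for w, s in sorted(scored, key=lambda x: -x[1]) if s >= threshold]
```

In practice one would query the real WordNet hierarchy (e.g. via NLTK) instead of a hand-built dictionary; the scoring and thresholding logic stays the same.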
Reliable video communication with multi-path streaming using MDC
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521522
I. Lee, L. Guan
Abstract: Video streaming demands high data rates and hard delay constraints, raising several challenges on today's packet-based, best-effort Internet. In this paper, we propose an efficient multiple-description coding (MDC) technique based on video frame sub-sampling and cubic-spline interpolation to provide spatial diversity, such that no additional buffering delay or storage is required. We also analyze the frame dropping rate due to packet loss and drifting error under the multi-path streaming environment.
Citations: 18
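The MDC idea of sub-sampling frames into two descriptions sent over different paths, then interpolating whatever is lost, can be sketched with 1-D "frames" and linear interpolation standing in for the paper's cubic spline (a simplification, not the authors' actual coder):

```python
def split_descriptions(frames):
    """Sub-sample a frame sequence into two descriptions (even/odd indices)."""
    return frames[0::2], frames[1::2]

def reconstruct(received_even, n_frames):
    """Recover the full sequence when only the even description arrives,
    filling each missing odd frame from its neighbors (linear interpolation
    as a stand-in for the paper's cubic-spline interpolation)."""
    out = [None] * n_frames
    for i, f in enumerate(received_even):
        out[2 * i] = f
    for i in range(1, n_frames, 2):
        left = out[i - 1]
        right = out[i + 1] if i + 1 < n_frames else left  # repeat at the tail
        out[i] = [(a + b) / 2 for a, b in zip(left, right)]
    return out
```

Because each description is independently decodable, losing one path degrades quality gracefully instead of stalling playback, which is the point of spatial diversity here.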
Supporting rights checking in an MPEG-21 Digital Item Processing environment
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521608
F. D. Keukelaere, T. DeMartini, Jeroen Bekaert, R. Walle
Abstract: Within the world of multimedia, the new MPEG-21 standard is currently under development. The purpose of this standard is to create an open framework for multimedia delivery and consumption. MPEG-21 masters the multitude of types of content and metadata by standardizing the declaration of digital items in an XML-based format. In addition to standardizing the declaration of digital items, MPEG-21 also standardizes digital item processing, which enables the declaration of suggested uses of digital items. The rights expression language and the rights data dictionary parts of MPEG-21 enable the declaration of what rights (permitted interactions) Users are given to digital items. In this paper, we describe how rights checking can be realized in an environment in which interactions with digital items are declared through digital item processing. We demonstrate how rights checking can be done when "critical" digital item base operations are called, and how rights context information can be gathered by tracking during the execution of digital item methods.
Citations: 4
Evaluating keypoint methods for content-based copyright protection of digital images
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521614
Larry Huston, R. Sukthankar, Yan Ke
Abstract: This paper evaluates the effectiveness of keypoint methods for content-based protection of digital images. These methods identify a set of "distinctive" regions (termed keypoints) in an image and encode them using descriptors that are robust to expected image transformations. To determine whether a particular image was derived from a protected image, the keypoints for both images are generated and their descriptors matched. We describe a comprehensive set of experiments examining how keypoint methods cope with three real-world challenges: (1) loss of keypoints due to cropping; (2) matching failures caused by approximate nearest-neighbor indexing schemes; (3) degraded descriptors due to significant image distortions. While keypoint methods perform very well in general, this paper identifies cases where their accuracy degrades.
Citations: 2
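The descriptor-matching step such evaluations rest on (accept a nearest-neighbor match only when it is clearly closer than the runner-up) can be sketched as follows. The 2-NN ratio test shown is the standard heuristic from the keypoint literature; the toy tuple "descriptors" stand in for real SIFT-style vectors:

```python
import math

def dist(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_keypoints(query, reference, ratio=0.8):
    """For each query descriptor, find its two nearest reference descriptors
    and keep the match only if the best is clearly closer than the second
    best (the ratio test). Returns (query_idx, ref_idx) pairs."""
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted(range(len(reference)), key=lambda ri: dist(q, reference[ri]))
        best, second = ranked[0], ranked[1]
        if dist(q, reference[best]) < ratio * dist(q, reference[second]):
            matches.append((qi, best))
    return matches
```

Exhaustive sorting is fine for a sketch; the paper's second challenge is precisely that real systems replace it with approximate nearest-neighbor indexes, which can miss matches.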
Hybrid speaker tracking in an automated lecture room
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521365
Cha Zhang, Y. Rui, Li-wei He, M. Wallick
Abstract: We present a hybrid speaker tracking scheme based on a single pan/tilt/zoom (PTZ) camera in an automated lecture capturing system. Given that the camera's video resolution is higher than the required output resolution, we frame the output video as a sub-region of the camera's input video. This allows us to track the speaker both digitally and mechanically. Digital tracking has the advantage of being smooth, while mechanical tracking can cover a wide area; the hybrid scheme achieves the benefits of both. In addition to hybrid tracking, we present an intelligent pan/zoom selection scheme to improve the aesthetics of the lecture scene.
Citations: 26
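The core decision in such a hybrid scheme (shift the output crop digitally while there is room, and fall back to a mechanical pan only when the crop would slide past the input frame) can be sketched in one function. Coordinates, margins, and the return convention below are illustrative, not from the paper:

```python
def track_step(speaker_x, crop_w, frame_w, margin=40):
    """One horizontal tracking step: center the output crop on the speaker
    digitally when possible; otherwise clamp the crop to the frame and
    report how far the PTZ camera should pan mechanically.
    Returns (new_crop_x, mechanical_pan_delta)."""
    desired = speaker_x - crop_w // 2          # crop position centering the speaker
    if margin <= desired <= frame_w - crop_w - margin:
        return desired, 0                      # smooth, purely digital tracking
    # Crop hit the usable frame boundary: request a mechanical pan instead.
    clamped = min(max(desired, 0), frame_w - crop_w)
    return clamped, desired - clamped
```

Digital moves are instantaneous and jitter-free, while mechanical moves are slow but extend coverage, which is why the combination works well for a lecture room.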
Proactive Energy Optimization Algorithms for Wavelet-Based Video Codecs on Power-Aware Processors
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521486
V. Akella, M. Schaar, W. Kao
Abstract: We propose a systematic technique for characterizing the workload of a video decoder at a given time and transforming the shape of the workload to optimize the utilization of a critical resource without increasing the distortion incurred in the process. We call our approach proactive resource management. We illustrate our techniques by addressing the problem of minimizing the energy consumed while decoding a video sequence on a programmable processor that supports multiple voltages and frequencies. We evaluate two different heuristics for the underlying optimization problem that yield 50% to 92% improvements in energy savings compared to techniques that do not use dynamic adaptation.
Citations: 7
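The underlying voltage/frequency selection problem can be sketched simply: run each frame's workload at the lowest operating point that still meets its deadline, since CMOS dynamic power scales roughly with V²f while runtime scales with cycles/f. The operating points below are invented for illustration, not the paper's:

```python
# Illustrative (voltage V, frequency MHz) operating points of a DVFS processor,
# sorted from lowest to highest power.
LEVELS = [(0.9, 200), (1.1, 400), (1.3, 600)]

def pick_level(workload_cycles, deadline_ms):
    """Choose the lowest-energy level that finishes the workload on time."""
    for volt, mhz in LEVELS:
        time_ms = workload_cycles / (mhz * 1000)  # MHz = 1000 cycles per ms
        if time_ms <= deadline_ms:
            return volt, mhz
    return LEVELS[-1]                             # best effort: run flat out

def energy(workload_cycles, volt):
    """Relative energy: cycles * V^2 (constant capacitance factor dropped)."""
    return workload_cycles * volt ** 2
```

The "proactive" part of the paper is predicting `workload_cycles` ahead of time from the decoder's workload shape, so the level can be chosen before the slack is gone rather than reactively.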
WA-TV: Webifying and Augmenting Broadcast Content for Next-Generation Storage TV
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521716
H. Miyamori, Qiang Ma, Katsumi Tanaka
Abstract: A method is proposed for viewing broadcast content that converts TV programs into Web content and integrates the results with complementary information retrieved from the Internet. Converting the programs into Web pages enables them to be skimmed for an overview and particular scenes to be easily explored. Integrating complementary information enables the programs to be viewed efficiently as value-added content. An intuitive, user-friendly browsing interface lets the user easily change the level of detail displayed for the integrated information by zooming. Preliminary testing of a prototype system for next-generation storage TV, "WA-TV", validated the approach taken by the proposed method.
Citations: 3
What happens in films?
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521357
A. Salway, Andrew Vassiliou, K. Ahmad
Abstract: This paper aims to contribute to the analysis and description of semantic video content by investigating what actions are important in films. We apply a corpus analysis method to identify frequently occurring phrases in texts that describe films: screenplays and audio description. Frequent words and statistically significant collocations of these words are identified in the screenplays of 75 films and in the audio description of 45 films. Phrases such as 'looks at', 'turns to', and 'smiles at', and various collocations of 'door', were found to be common. We argue that these phrases occur frequently because they describe actions that are important story-telling elements of filmed narrative. We discuss how this knowledge helps the development of systems to structure semantic video content.
Citations: 23
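The corpus step (counting frequent words and their collocates in descriptive text) can be sketched with stdlib tools. Adjacent-bigram counting is used here as a simple proxy for the paper's collocation statistics, with no significance test applied, and the sample sentence is made up:

```python
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs in a text (a crude collocation measure)."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def top_phrases(text, n=3):
    """Return the n most frequent two-word phrases."""
    return [" ".join(pair) for pair, _ in bigram_counts(text).most_common(n)]
```

Run over enough screenplay text, this kind of count is exactly what surfaces phrases like 'looks at' and 'turns to' as candidate story-telling actions.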
Joint Image Halftoning and Watermarking in High-Resolution Digital Form
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521478
Chao-Yong Hsu, Chun-Shien Lu
Abstract: Existing halftone image watermarking methods embed a watermark bit in a halftone dot, which corresponds to one pixel, to generate a stego halftone image. This one-to-one mapping, however, is inconsistent with the one-to-many strategy used by current high-resolution devices, such as computer printers and screens, where one pixel is first expanded into many dots before a halftoning process generates the halftone image. Furthermore, electronic or smart paper that produces high-resolution digital files cannot be protected by traditional halftone watermarking methods. In view of these facts, we present a high-resolution halftone watermarking scheme that addresses these problems. The characteristics of our scheme are: (i) a high-resolution halftoning process that employs a one-to-many mapping strategy; (ii) a many-to-one inverse halftoning process that generates gray-scale images of good quality; and (iii) halftone image watermarking that can be conducted directly on gray-scale rather than halftone images, achieving better robustness.
Citations: 4
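The one-to-many expansion the abstract describes can be illustrated with ordered dithering against a 2x2 Bayer threshold matrix: each gray pixel becomes a 2x2 block of binary dots, and dot density recovers the gray level. This is a generic halftoning sketch, not the authors' actual algorithm:

```python
# 2x2 Bayer threshold matrix, scaled to the 0-255 gray range.
BAYER2 = [[0, 128], [192, 64]]

def halftone_pixel(gray):
    """One-to-many mapping: expand one gray-scale pixel (0-255) into a 2x2
    block of binary dots; a dot is on where gray exceeds the local threshold."""
    return [[1 if gray > t else 0 for t in row] for row in BAYER2]

def inverse_halftone(block):
    """Many-to-one inverse: estimate the gray level from the dot density."""
    dots = sum(sum(row) for row in block)
    return round(255 * dots / 4)
```

With only 4 dots per pixel the inverse is coarse (5 recoverable gray levels); real devices use larger blocks, which is what makes watermarking in the gray-scale domain attractive.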
Video quality classification based home video segmentation
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521399
Si Wu, Yu-Fei Ma, HongJiang Zhang
Abstract: Home videos often contain abnormal camera motions, such as shaking and irregular movement, which degrade visual quality. Removing bad-quality segments and automatically stabilizing shaky ones are necessary steps for home video archiving. In this paper, we propose a novel segmentation algorithm for home video based on video quality classification. According to three important properties of motion (speed, direction, and acceleration), the effects caused by camera motion are classified into four categories: blurred, shaky, inconsistent, and stable, using support vector machines (SVMs). Based on this classification, a multi-scale sliding window parses the video sequence into segments along the time axis, and each segment is labeled with one of the camera motion effects. The effectiveness of the proposed approach has been validated by extensive experiments.
Citations: 21
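The classification stage maps per-window motion features (speed, direction variance, acceleration) to one of the four labels. A rule-based stand-in with invented thresholds shows the shape of that mapping; the paper learns the decision boundary with SVMs instead:

```python
def classify_segment(speed, direction_var, accel):
    """Label a video window from its camera-motion features.
    Thresholds are illustrative only, standing in for a trained SVM."""
    if accel > 5.0:
        return "blurred"        # abrupt acceleration smears frames
    if direction_var > 1.0:
        return "shaky"          # direction flips back and forth
    if speed > 2.0:
        return "inconsistent"   # fast, drifting camera motion
    return "stable"

def segment_video(feature_seq):
    """Slide over per-window feature triples and label each window."""
    return [classify_segment(*f) for f in feature_seq]
```

Merging runs of identical labels along the time axis then yields the segment boundaries the paper's multi-scale sliding window produces.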