2006 IEEE International Conference on Multimedia and Expo: Latest Papers

A Multimedia System for Route Sharing and Video-Based Navigation
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262553
Wen Wu, Jie Yang, Jing Zhang
Abstract: Trip planning and in-vehicle navigation are crucial tasks for easier and safer driving. Existing navigation systems are based on machine intelligence and do not allow human knowledge to be incorporated. These systems give turn guidance with abstract visual instructions and have not reached their potential for minimizing the driver's cognitive load, which is the amount of mental processing power required. In this paper, we describe the development of a multimedia system that makes driving and navigation safer and easier by offering tools for route sharing in trip planning and video-based route guidance during driving. The system provides a multimodal interface for a user to share his/her route with others by drawing on a digital map, naturally incorporating human knowledge into the trip planning process. The system gives driving instructions by overlaying navigational arrows onto live video and providing synthesized voice to reduce the driver's cognitive load, in addition to presenting landmark images for key maneuvers. We describe the observations that motivated the development of the system, its detailed architecture and user interfaces, and finally discuss our initial test findings in a real-road driving context.
Citations: 7
Optimization of Matching Pursuit Encoder Based on Analytical Approximation of Matching Pursuit Distortion
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262407
A. Shoa, S. Shirani
Abstract: The distortion of matching pursuit (MP) is calculated in terms of the MP encoder parameters for uniformly distributed signals and dictionaries. The MP encoder is then optimized using the analytically derived approximation of the MP distortion. Our simulation results show that this optimized MP encoder exhibits optimum performance for nonuniform signal and dictionary distributions as well.
Citations: 0
Consistent Goal-Directed User Model for Realisitc Man-Machine Task-Oriented Spoken Dialogue Simulation
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262563
O. Pietquin
Abstract: Because of the great variability of the factors to take into account, designing a spoken dialogue system is still a tailoring task. Rapid design and reuse of previous work are therefore very difficult. For these reasons, the application of machine learning methods to dialogue strategy optimization has become a leading research subject over the last decade. Yet techniques such as reinforcement learning demand large amounts of training data, while obtaining a substantial amount of data in the particular case of spoken dialogues is time-consuming and therefore expensive. In order to expand existing data sets, dialogue simulation techniques are becoming a standard solution. In this paper we describe a user modeling technique for realistic simulation of man-machine goal-directed spoken dialogues. This model, based on a stochastic description of man-machine communication, is, unlike previously proposed models, consistent throughout the interaction with respect to its history and a predefined user goal.
Citations: 37
SVM-Based Shot Boundary Detection with a Novel Feature
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262911
Kazunori Matsumoto, Masaki Naito, K. Hoashi, F. Sugaya
Abstract: This paper describes our new algorithm for shot boundary detection and its evaluation. We adopt a two-stage data fusion approach with an SVM technique to decide whether or not a boundary exists within a given video sequence. This approach is useful for avoiding huge feature space problems, even when we adopt many promising features extracted from a video sequence. We also introduce a novel feature to improve detection. The feature consists of two kinds of values extracted from a local frame sequence. One is the image difference between the target frame and a frame synthesized from its neighbors. The other is the difference between the neighbors. This feature can be extracted quickly with a least-squares technique. Our algorithm is evaluated with the TRECVID evaluation framework. Our system achieved high performance on the shot boundary detection task in TRECVID 2005.
Citations: 43
Nonlinearly-Adapted Lapped Transforms for Intra-Frame Coding
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262601
D. Lelescu
Abstract: The use of block transforms for coding intra-frames in video coding may preclude higher coding performance due to residual correlation across block boundaries and insufficient energy compaction, which translates into unrealized rate-distortion gains. Subjectively, the occurrence of blocking artifacts is common. Post-filters and lapped transforms offer good solutions to these problems. Lapped transforms offer a more general framework that can incorporate coordinated pre- and post-filtering operations. Most common are fixed lapped transforms (such as lapped orthogonal transforms) and transforms with adaptive basis function length. In contrast, in this paper we determine a lapped transform that nonlinearly adapts its basis functions to local image statistics and the quantization regime. This transform was incorporated into the H.264/AVC codec, and its performance was evaluated. As a result, significant rate-distortion gains of up to 0.45 dB (average 0.35 dB) PSNR were obtained compared to the H.264/AVC codec alone.
Citations: 1
Data Embedding in MPEG-1/Audio Layer II Compressed Domain using Side Information
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262848
Akihiro Matsuoka, Kiyoshi Tanaka, A. Yoneyama, Y. Nakajima
Abstract: In this work, we propose a data embedding scheme in the MPEG-1/Audio Layer II compressed domain. Data embedding is conducted in every AAU by using side information (the location of the sub-bands to which audio signal is allocated) as a data carrier. In general, non-zero signals concentrate in the low and middle frequency bands. We therefore use sub-bands in the high frequency bands to which no audio signal is allocated to embed information. The proposed scheme can increase payload while achieving rewritable (reversible) data embedding by choosing appropriate parameters. We verify the basic performance of our scheme through computer simulation using voice and music signals.
Citations: 1
Person Tracking in Smart Rooms using Dynamic Programming and Adaptive Subspace Learning
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262620
ZhenQiu Zhang, G. Potamianos, Stephen M. Chu, J. Tu, Thomas S. Huang
Abstract: We present a robust vision system for single-person tracking inside a smart room using multiple synchronized, calibrated, stationary cameras. The system consists of two main components, namely initialization and tracking, assisted by an additional component that detects tracking drift. The main novelty lies in the adaptive tracking mechanism, which is based on subspace learning of the tracked person's appearance in selected two-dimensional camera views. The subspace is learned on the fly, during tracking, but in contrast to the traditional approach in the literature, an additional "forgetting" mechanism is introduced as a means of reducing drift. The proposed algorithm replaces mean-shift tracking, previously employed in our work. By combining the proposed technique with a robust initialization component that is based on face detection and spatio-temporal dynamic programming, the resulting vision system significantly outperforms previously reported systems on the task of tracking the seminar presenter in data collected as part of the CHIL project.
Citations: 5
On the Detection of Multiplicative Watermarks for Speech Signals in the Wavelet and DCT Domains
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262793
R. Eslami, J. Deller, H. Radha
Abstract: Blind multiplicative watermarking schemes for speech signals using wavelets and the discrete cosine transform are presented. Watermarked signals are modeled using a generalized Gaussian distribution (GGD) and a Cauchy probability model. Detectors are developed employing the generalized likelihood ratio test (GLRT) and the locally most powerful (LMP) approach. The LMP scheme is used for the Cauchy distribution, while the GLRT estimates the gain factor as an unknown parameter in the GGD model. The detectors are tested using Monte Carlo simulation, and the results show the superiority of the proposed LMP/Cauchy detector in some experiments.
Citations: 7
Template-Based Semi-Automatic Profiling of Multimedia Applications
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262717
C. Poucet, David Atienza Alonso, F. Catthoor
Abstract: Modern multimedia applications use the memory hierarchy very dynamically, depending on the actual input, and therefore require run-time profiling techniques to enable optimizations. Because such applications can contain hundreds of thousands of lines of complex object-oriented specifications, profiling is a tedious, time-consuming task, since the profiling code is usually added manually. In this paper, we present a high-level library-based approach for profiling both statically and dynamically defined variables using templates in C++. Our results for the visual texture coder of the MPEG-4 standard show that, using the information this approach provides, we can easily achieve 70.56% energy savings and a 19.22% reduction in memory accesses.
Citations: 14
Image Content Clustering and Summarization for Photo Collections
2006 IEEE International Conference on Multimedia and Expo Pub Date : 2006-07-09 DOI: 10.1109/ICME.2006.262710
Cheng-Hung Li, Chih-Yi Chiu, Chun-Rong Huang, Chu-Song Chen, Lee-Feng Chien
Abstract: The rapid growth of digital photography in recent years has spurred the need for photo management tools. In this study, we propose an automatic organization framework for photo collections based on image content, so that users are given a novel browsing experience. For each photograph, human faces, together with the corresponding clothes and nearby regions, are located. We extract color histograms of these regions as the image content feature. A similarity matrix of the photo collection is then generated according to the temporal and content features of the photographs. We perform hierarchical clustering based on this matrix, and extract duplicate subjects of a cluster using the contrast context histogram (CCH) technique. The experimental results show that the developed framework gives promising results for photo management.
Citations: 16