2011 IEEE International Conference on Multimedia and Expo: Latest Publications

Stroke-based creation of depth maps
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012006
M. Gerrits, B. Decker, Cosmin Ancuti, Tom Haber, C. Ancuti, T. Mertens, P. Bekaert
Abstract: Depth information opens up a lot of possibilities for meaningful editing of photographs. So far, it has only been possible to acquire depth information by using additional hardware, imposing restrictive scene assumptions, or relying on extensive manual input. We developed a novel user-assisted technique for creating adequate depth maps with an intuitive stroke-based user interface. Starting from absolute depth constraints as well as surface normal constraints, we optimize for a feasible depth map over the image. We introduce a suitable smoothness constraint that respects image edges and accounts for slanted surfaces. We illustrate the usefulness of our technique with several applications such as depth-of-field reduction and advanced compositing.
Citations: 8
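The abstract does not spell out the optimization itself; the sketch below only illustrates the general idea of propagating sparse stroke depths with an edge-aware smoothness term, posed as a sparse linear least-squares problem. The Gaussian affinity weight, the 4-neighborhood graph, and the parameters `sigma` and `lam` are assumptions for illustration, not the authors' energy (which also includes surface-normal constraints).

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def propagate_depth(gray, stroke_depth, stroke_mask, sigma=0.1, lam=100.0):
    """Edge-aware propagation of sparse stroke depths over an image.

    gray         : HxW float array in [0, 1] (image intensity)
    stroke_depth : HxW float array, depth values where strokes were drawn
    stroke_mask  : HxW bool array, True at stroke pixels
    Returns an HxW dense depth map.

    Illustrative least-squares formulation only (smoothness weighted by
    intensity differences, soft data term on strokes); loops are kept
    simple for clarity rather than speed.
    """
    h, w = gray.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)
    g = gray.ravel()

    rows, cols, vals = [], [], []
    def add_edge(a, b):
        # Smoothness weight: strong where intensities are similar,
        # weak across image edges (assumed Gaussian affinity).
        wgt = np.exp(-((g[a] - g[b]) ** 2) / (2 * sigma ** 2))
        rows.extend([a, b, a, b]); cols.extend([a, b, b, a])
        vals.extend([wgt, wgt, -wgt, -wgt])

    for y in range(h):
        for x in range(w):
            a = idx[y, x]
            if x + 1 < w: add_edge(a, idx[y, x + 1])
            if y + 1 < h: add_edge(a, idx[y + 1, x])

    L = sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
    D = sparse.diags(lam * stroke_mask.ravel().astype(float))
    b = lam * (stroke_depth * stroke_mask).ravel()
    depth = spsolve((L + D).tocsc(), b)
    return depth.reshape(h, w)
```

In practice the stroke mask and depth values would come from the user interface; the system stays sparse, so a direct solve handles moderately sized images.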
LipActs: Efficient representations for visual speakers
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012102
E. Zavesky
Abstract: Video-based lip activity analysis has been successfully used for assisting speech recognition for almost a decade. Surprisingly, this same capability has not been heavily used for near real-time visual speaker retrieval and verification, due to tracking complexity, inadequate or difficult feature determination, and the need for a large amount of pre-labeled data for model training. This paper explores the performance of several solutions using modern histogram of oriented gradients (HOG) features and several quantization techniques, and analyzes the benefits of temporal sampling and spatial partitioning to derive a representation called LipActs. Two datasets are used for evaluation: one with 81 participants derived from varying-quality YouTube content and one with 3 participants derived from a forward-facing mobile video camera with 10 varied lighting and capture-angle environments. Over these datasets, LipActs with a moderate number of pooled temporal frames and multi-resolution spatial quantization offer an improvement of 37–73% over raw features when optimizing for lowest equal error rate (EER).
Citations: 1
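As a rough illustration of the kind of representation described (HOG features over mouth crops with temporal pooling), here is a minimal sketch using `skimage.feature.hog`. The cell/block geometry, the mean-pooling operator, and the window length `pool` are assumptions; the quantization schemes and multi-resolution spatial partitioning studied in the paper are omitted.

```python
import numpy as np
from skimage.feature import hog

def lip_descriptor(mouth_frames, pool=8):
    """Pool HOG features over short temporal windows of mouth crops.

    mouth_frames : sequence of HxW grayscale mouth-region crops,
                   all assumed to have the same size
    pool         : number of consecutive frames averaged per descriptor
    Returns an array with one pooled descriptor per temporal window.
    Illustrative settings only, not the paper's exact configuration.
    """
    per_frame = np.asarray([
        hog(f, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), feature_vector=True)
        for f in mouth_frames
    ])
    windows = [per_frame[i:i + pool].mean(axis=0)
               for i in range(0, len(per_frame) - pool + 1, pool)]
    return np.asarray(windows)
```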
Learn-pads: A mathematical exergaming system for children's physical and mental well-being
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6011852
Ali Karime, Hussein Al Osman, W. Gueaieb, J. Jaam, Abdulmotaleb El Saddik
Abstract: Child obesity is one of the major challenges facing modern societies, especially in developed countries. Exergaming tools are considered effective means of reducing obesity among children because they require the players to exert physical effort while playing. However, most existing exergaming tools focus on the physical well-being of their users and largely neglect the mental aspect. In this paper, we present an exergaming system that combines both aspects by promoting not only entertainment, but also learning through physical activity. The system consists of a set of footpads that allow the user to interact with video games enriched with multimedia and aimed at enhancing the math knowledge of children. Our study shows that the system created an atmosphere of fun among the children and engaged them in learning.
Citations: 14
Application-layer error resilience for wireless IP-based video broadcasting
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6011938
Sheau-Ru Tong, Yuan-Tse Yu, C. Chen
Abstract: Wireless IP-based video broadcast suffers from heavy packet losses caused by multipath fading and interference variations in the wireless channels. This paper addresses this issue by proposing an application-layer error-resilience scheme, called the replicate multiple descriptor coding scheme (RMD). In principle, RMD extends the conventional multiple descriptor transmission strategy with two new features: selective frame-based replication and time-shifted descriptor transmission. We show that with these two features, we are able to exploit the time diversity in a time-sharing channel to mitigate the damage impact and provide more efficient protection for the video. The simulation results confirm that when the packet loss rate is heavy (e.g., 15%–35%), RMD outperforms other schemes in terms of PSNR improvement, while only requiring moderate data overheads.
Citations: 1
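The abstract only names the two ingredients, selective frame-based replication and time-shifted descriptor transmission; the toy model below shows one way those ingredients could fit together. The even/odd frame split, the choice of GOP-leading frames as the replicated ones, and the shift length are illustrative assumptions, not the scheme's actual design.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Descriptor:
    frames: List[int] = field(default_factory=list)  # frame indices carried

def build_rmd_descriptors(num_frames, gop=8, shift=4):
    """Toy model of replicate multiple descriptor (RMD) transmission.

    Descriptor A carries even frames, descriptor B carries odd frames,
    and frames assumed to be important (here: the first frame of each
    GOP) are replicated into both.  B is sent with a time shift of
    `shift` slots, so a burst loss is unlikely to hit both copies of
    the same frame.  Purely illustrative.
    """
    a, b = Descriptor(), Descriptor()
    for f in range(num_frames):
        key = (f % gop == 0)            # selective replication of key frames
        if f % 2 == 0 or key:
            a.frames.append(f)
        if f % 2 == 1 or key:
            b.frames.append(f)
    # Transmission schedule entries: (time_slot, descriptor_id, frame_index)
    schedule = [(i, "A", f) for i, f in enumerate(a.frames)]
    schedule += [(i + shift, "B", f) for i, f in enumerate(b.frames)]
    return sorted(schedule)
```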
Commercial detection by mining maximal repeated sequence in audio stream
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012115
Jiansong Chen, Teng Li, Lei Zhu, Peng Ding, Bo Xu
Abstract: Efficient detection of commercials is important for many applications such as commercial monitoring and market investigation. This paper reports an unsupervised technique for discovering commercials by mining repeated sequences in an audio stream. Compared with previous work, we focus on solving practical problems by introducing three principles of commercials: the repetition principle, the independence principle, and the equivalence principle. Based on these principles, we detect commercials by first mining maximal repeated sequences (MRS) and then post-processing the MRS pairs with the independence and equivalence principles to obtain the final result. In addition, a coarse-to-fine scheme is adopted in the acoustic matching stage to save computational cost. Extensive experiments on both simulated data and real broadcast data demonstrate the effectiveness of our method.
Citations: 3
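To make the mining step concrete, the brute-force sketch below finds repeated subsequences in a stream of quantized audio codewords and keeps those that are not prefixes or suffixes of a longer repeat. It stands in for MRS mining only conceptually; the paper's coarse-to-fine acoustic matching and the independence/equivalence post-processing are not modeled, and the codeword alphabet is an assumption.

```python
from collections import defaultdict

def maximal_repeated_sequences(symbols, min_len=4):
    """Brute-force search for maximal repeated subsequences.

    `symbols` is a sequence of discrete acoustic codewords (e.g. from
    quantizing per-frame audio features).  A pattern is 'repeated' if
    it occurs at least twice; it is kept as 'maximal' if no repeated
    pattern one symbol longer contains it as a prefix or suffix.
    """
    n = len(symbols)
    repeats = {}
    length = min_len
    while True:
        occ = defaultdict(list)
        for i in range(n - length + 1):
            occ[tuple(symbols[i:i + length])].append(i)
        found = {p: pos for p, pos in occ.items() if len(pos) >= 2}
        if not found:
            break
        repeats[length] = found
        length += 1

    maximal = []
    for length, found in repeats.items():
        longer = repeats.get(length + 1, {})
        for pat, pos in found.items():
            extended = any(pat == lp[:length] or pat == lp[1:]
                           for lp in longer)
            if not extended:
                maximal.append((list(pat), pos))
    return maximal
```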
An efficient, robust video fingerprinting system
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012135
R. Cook
Abstract: An efficient, robust system for machine identification of file and stream-based video content is presented. Efficiency is achieved through easily computed features, simple comparisons, and careful selection of robust indices that lead to fast searches. Robustness is achieved by selection of features that reflect the time structure of the content, a measure of how the visual content changes over time, perhaps the quintessential aspect of video. These features, primarily the overall luminance and interframe luminance differences, are unlikely to change as the underlying signal is distorted by typical video processing, both benevolent and otherwise. Feature extraction, indexing, and matching are discussed.
Citations: 1
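The features named in the abstract (overall luminance and inter-frame luminance differences) are straightforward to compute; the sketch below derives a per-clip signature from them. The clip length, the RGB-to-luma weights, and the normalization are assumptions, and the indexing and matching stages are omitted.

```python
import numpy as np

def video_fingerprint(frames, clip_len=32):
    """Compact temporal fingerprint from mean luminance and its changes.

    frames : iterable of HxW grayscale or HxWx3 RGB arrays
    Returns one signature vector per clip of `clip_len` frames, built
    from per-frame mean luminance and absolute inter-frame differences,
    then normalized to reduce sensitivity to brightness/contrast shifts.
    """
    lum = []
    for f in frames:
        f = np.asarray(f, dtype=np.float64)
        if f.ndim == 3:  # rough luma from RGB
            f = 0.299 * f[..., 0] + 0.587 * f[..., 1] + 0.114 * f[..., 2]
        lum.append(f.mean())
    lum = np.asarray(lum)
    diffs = np.abs(np.diff(lum, prepend=lum[0]))

    signatures = []
    for start in range(0, len(lum) - clip_len + 1, clip_len):
        seg = np.concatenate([lum[start:start + clip_len],
                              diffs[start:start + clip_len]])
        seg = (seg - seg.mean()) / (seg.std() + 1e-8)
        signatures.append(seg)
    return np.asarray(signatures)
```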
Classified quadtree-based adaptive loop filter
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012172
Qian Chen, Yunfei Zheng, P. Yin, X. Lu, J. Solé, Qian Xu, E. François, D. Wu
Abstract: In this paper, we propose a classified quadtree-based adaptive loop filter (CQALF) for video coding. Pixels in a picture are classified into two categories according to the impact of the deblocking filter: pixels that are modified by the deblocking filter and pixels that are not. A Wiener filter is carefully designed for each category, and the filter coefficients are transmitted to the decoder. For the pixels that are modified by the deblocking filter, the filter is estimated at the encoder by minimizing the mean square error between the original input frame and a combined frame, which is a weighted average of the reconstructed frames before and after the deblocking filter. For pixels that the deblocking filter does not modify, the filter is estimated by minimizing the mean square error between the original frame and the reconstructed frame. The proposed algorithm is implemented on top of the KTA software and is compatible with the quadtree-based adaptive loop filter. Compared with the kta2.6r1 anchor, the proposed CQALF achieves average BD bitrate reductions of 10.05%, 7.55%, and 6.19% for intra-only, IPPP, and HB coding structures, respectively.
Citations: 9
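Estimating a Wiener filter by minimizing the mean square error against the original frame reduces to a linear least-squares problem; the sketch below shows that generic per-class estimation step. The tap geometry, the edge padding, and the use of `numpy.linalg.lstsq` are illustrative choices; the quadtree on/off signaling and the weighted-average combined frame the paper uses for the "modified" class are not modeled.

```python
import numpy as np

def estimate_wiener_filter(recon, original, mask, radius=2):
    """Least-squares estimate of a (2r+1)x(2r+1) Wiener filter.

    recon    : reconstructed (decoded) frame, HxW float
    original : original frame, HxW float
    mask     : HxW bool, True for pixels of this class
               (e.g. pixels touched by the deblocking filter)
    Solves min_w sum_p (original[p] - w . neighborhood(recon, p))^2
    over the masked pixels and returns the filter taps.
    """
    r = radius
    pad = np.pad(recon, r, mode="edge")
    ys, xs = np.nonzero(mask)
    # Design matrix: one row of (2r+1)^2 neighborhood samples per pixel.
    A = np.stack([pad[y:y + 2 * r + 1, x:x + 2 * r + 1].ravel()
                  for y, x in zip(ys, xs)])
    b = original[ys, xs]
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs.reshape(2 * r + 1, 2 * r + 1)
```

In a codec, the estimated coefficients would be quantized and signaled to the decoder, which applies the same per-class filter to its own reconstruction.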
Photo identity tag suggestion using only social network context on large-scale web services
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012061
Chi-Yao Tseng, Ming-Syan Chen
Abstract: Uploading photos and adding identity tags on social network services have become prevalent. Although some researchers have considered leveraging context to facilitate the tagging process, these approaches still rely mainly on face recognition techniques that use visual features of photos. However, since the computational and storage costs of these approaches are generally high, they are not directly applicable to large-scale web services. To resolve this problem, we explore using only social network context to generate a top-k list of photo identity tag suggestions. The proposed method is based on various co-occurrence contexts related to the question of who may appear in a given photo. An efficient ranking algorithm is designed to satisfy the real-time needs of this application. We utilize public album data of 400 volunteers from Facebook to verify that our approach can efficiently provide accurate suggestions with little additional storage.
Citations: 5
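As a toy version of ranking by social-network context alone, the sketch below scores candidate identities by how often they co-occur in past albums with the uploader and with people already tagged in the photo. The specific score and the data layout are assumptions, not the paper's co-occurrence contexts or ranking algorithm.

```python
from collections import defaultdict

def suggest_tags(albums, uploader, already_tagged, k=5):
    """Rank candidate identities for a new photo using only social context.

    albums         : list of photos, each a set of user ids tagged together
    uploader       : id of the user posting the new photo
    already_tagged : ids already confirmed in the new photo
    Returns the top-k candidate ids by a simple co-occurrence score.
    """
    cooc = defaultdict(int)
    for photo in albums:
        for a in photo:
            for b in photo:
                if a != b:
                    cooc[(a, b)] += 1

    candidates = {u for photo in albums for u in photo} - set(already_tagged)

    def score(c):
        # Co-occurrence with the uploader plus with everyone already tagged.
        return cooc[(c, uploader)] + sum(cooc[(c, t)] for t in already_tagged)

    return sorted(candidates, key=score, reverse=True)[:k]
```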
Automatic voice disorder classification using vowel formants
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012187
Muhammad Ghulam, M. Alsulaiman, A. Mahmood, Z. Ali
Abstract: In this paper, we propose an automatic voice disorder classification system using the first two formants of vowels. Five types of voice disorder, namely cyst, GERD, paralysis, polyp, and sulcus, are used in the experiments. Spoken Arabic digits from voice-disordered speakers are recorded as input. The first and second formants are extracted from the vowels [Fatha] and [Kasra], which are present in the Arabic digits. These four features are then used to classify the voice disorder using two types of classification methods: vector quantization (VQ) and neural networks. In the experiments, the neural network performs better than VQ. For female and male speakers, the classification rates are 67.86% and 52.5%, respectively, using neural networks. The best classification rate, 78.72%, is obtained for the female sulcus disorder.
Citations: 38
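The two features per vowel are the first and second formants; one standard way to obtain them is LPC analysis followed by root finding, sketched below in plain NumPy. The pre-emphasis coefficient, LPC order, and root-filtering threshold are textbook defaults, not settings reported by the authors, and the classifier stage (VQ or a neural network) is omitted.

```python
import numpy as np

def first_two_formants(frame, fs=16000, order=12):
    """Estimate F1 and F2 of a voiced frame via LPC root finding.

    frame : 1-D array of speech samples (one analysis window of a vowel)
    fs    : sampling rate in Hz
    Returns (F1, F2) in Hz using pre-emphasis, Hamming windowing,
    autocorrelation (Yule-Walker) LPC, and roots of the LPC polynomial.
    """
    x = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])  # pre-emphasis
    x = x * np.hamming(len(x))

    # Autocorrelation method for LPC coefficients.
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    lpc = np.concatenate(([1.0], -a))

    # Formants correspond to the angles of complex roots in the upper half-plane.
    roots = np.roots(lpc)
    roots = roots[np.imag(roots) > 0]
    freqs = np.sort(np.angle(roots) * fs / (2 * np.pi))
    freqs = freqs[freqs > 90]          # drop near-DC roots
    return freqs[0], freqs[1]
```

In the setup the abstract describes, the (F1, F2) pairs from [Fatha] and [Kasra] would form the four-dimensional feature vector fed to the classifier.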
Temporal-spatial face recognition using multi-atlas and Markov process model
Pub Date: 2011-07-11 | DOI: 10.1109/ICME.2011.6012063
Gaopeng Gou, Rui Shen, Yunhong Wang, A. Basu
Abstract: Although video-based face recognition algorithms can provide more information than image-based algorithms, their performance is affected by subjects' head poses, expressions, illumination and so on. In this paper, we present an effective video-based face recognition algorithm. Multi-atlas is employed to efficiently represent faces of individual persons under various conditions, such as different poses and expressions. The Markov process model is used to propagate the temporal information between adjacent video frames. The combination of multi-atlas and Markov model provides robust face recognition by taking both spatial and temporal information into account. The performance of our algorithm was evaluated on three standard test databases: the Honda/UCSD video database, the CMU Motion of Body database, and the multi-modal VidTIMIT database. Experimental results demonstrate that our video-based face recognition algorithm outperforms other methods on all three test databases.
Citations: 6
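The temporal side of the method, propagating evidence between adjacent frames with a Markov process, can be illustrated with a simple forward recursion over per-frame identity scores. The uniform transition matrix with a single `stay_prob` parameter and the multiplicative fusion are assumptions for illustration; the multi-atlas matching that would supply `frame_scores` is not modeled.

```python
import numpy as np

def propagate_identity(frame_scores, stay_prob=0.95):
    """Fuse per-frame recognition scores over time with a Markov prior.

    frame_scores : T x N array; frame_scores[t, i] is a positive score for
                   identity i in frame t (e.g. similarity to person i's
                   atlas images).
    stay_prob    : probability that the identity stays the same between
                   consecutive frames; the rest is spread uniformly.
    Returns the posterior over the N identities after the last frame.
    """
    T, N = frame_scores.shape
    trans = np.full((N, N), (1.0 - stay_prob) / (N - 1))
    np.fill_diagonal(trans, stay_prob)

    belief = np.full(N, 1.0 / N)
    for t in range(T):
        belief = trans.T @ belief          # temporal prediction
        belief *= frame_scores[t]          # fuse current-frame evidence
        belief /= belief.sum()             # renormalize
    return belief
```

Identification then takes the argmax of the returned posterior; verification would threshold the posterior of the claimed identity.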