Latest Publications from the 2015 IEEE International Conference on Multimedia and Expo (ICME)

Fusion of Time-of-Flight and Phase Shifting for high-resolution and low-latency depth sensing
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177426
Yueyi Zhang, Zhiwei Xiong, Feng Wu
{"title":"Fusion of Time-of-Flight and Phase Shifting for high-resolution and low-latency depth sensing","authors":"Yueyi Zhang, Zhiwei Xiong, Feng Wu","doi":"10.1109/ICME.2015.7177426","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177426","url":null,"abstract":"Depth sensors based on Time-of-Flight (ToF) and Phase Shifting (PS) have complementary strengths and weaknesses. ToF can provide real-time depth but limited in resolution and sensitive to noise. PS can generate accurate and robust depth with high resolution but requires a number of patterns that leads to high latency. In this paper, we propose a novel fusion framework to take advantages of both ToF and PS. The basic idea is using the coarse depth from ToF to disambiguate the wrapped depth from PS. Specifically, we address two key technical problems: cross-modal calibration and interference-free synchronization between ToF and PS sensors. Experiments demonstrate that the proposed method generates accurate and robust depth with high resolution and low latency, which is beneficial to tremendous applications.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133643387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
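The disambiguation idea, using the coarse but absolute ToF depth to pick the right period for the wrapped PS depth, fits in a few lines. The following is a minimal numerical sketch of that idea, not the paper's code; the 0.5 m ambiguity interval is made up for illustration.

```python
import numpy as np

def unwrap_ps_with_tof(d_wrapped, d_tof, period):
    """Disambiguate wrapped Phase Shifting depth with coarse ToF depth.

    d_wrapped : PS depth, known only modulo `period` (same units as d_tof)
    d_tof     : coarse but absolute ToF depth, resampled to the PS grid
    period    : depth ambiguity interval of the PS patterns (assumed value)
    """
    # Integer number of periods that best reconciles the two measurements;
    # correct whenever the ToF error stays under half a period.
    k = np.round((d_tof - d_wrapped) / period)
    return d_wrapped + k * period

# Toy example: true depth 2.37 m, ambiguity interval 0.5 m, ToF reads 2.41 m.
d_true = 2.37
d_ps = d_true % 0.5                          # 0.37 m, wrapped
print(unwrap_ps_with_tof(d_ps, 2.41, 0.5))   # -> 2.37
```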
Joint Latent Dirichlet Allocation for non-iid social tags
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177490
Jiangchao Yao, Ya Zhang, Zhe Xu, Jun-wei Sun, Jun Zhou, Xiao Gu
{"title":"Joint Latent Dirichlet Allocation for non-iid social tags","authors":"Jiangchao Yao, Ya Zhang, Zhe Xu, Jun-wei Sun, Jun Zhou, Xiao Gu","doi":"10.1109/ICME.2015.7177490","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177490","url":null,"abstract":"Topic models have been widely used for analyzing text corpora and achieved great success in applications including content organization and information retrieval. However, different from traditional text data, social tags in the web containers are usually of small amounts, unordered, and non-iid, i.e., it is highly dependent on contextual information such as users and objects. Considering the specific characteristics of social tags, we here introduce a new model named Joint Latent Dirichlet Allocation (JLDA) to capture the relationships among users, objects, and tags. The model assumes that the latent topics of users and those of objects jointly influence the generation of tags. The latent distributions is then inferred with Gibbs sampling. Experiments on two social tag data sets have demonstrated that the model achieves a lower predictive error and generates more reasonable topics. We also present an interesting application of this model to object recommendation.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116939995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
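One plausible reading of the JLDA generative story, with user topics and object topics jointly selecting a tag distribution, can be sketched as follows. All sizes and hyperparameters are illustrative, and the paper's actual model and Gibbs sampler may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (not from the paper).
U, O, V = 5, 8, 50        # users, objects, tag vocabulary
Ku, Ko = 3, 4             # user-topic and object-topic counts
alpha, beta = 0.5, 0.1    # symmetric Dirichlet priors

theta_u = rng.dirichlet(np.full(Ku, alpha), size=U)   # per-user topic mix
theta_o = rng.dirichlet(np.full(Ko, alpha), size=O)   # per-object topic mix
phi = rng.dirichlet(np.full(V, beta), size=(Ku, Ko))  # tag dist. per topic pair

def sample_tag(u, o):
    """Generate one tag for a (user, object) pair under the assumed model:
    the user topic and the object topic jointly pick the tag distribution."""
    zu = rng.choice(Ku, p=theta_u[u])
    zo = rng.choice(Ko, p=theta_o[o])
    return rng.choice(V, p=phi[zu, zo])

print([sample_tag(0, 3) for _ in range(5)])
```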
Energy and area efficient hardware implementation of 4K Main-10 HEVC decoder in Ultra-HD Blu-ray player and TV systems
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177399
Tsu-Ming Liu, Yung-Chang Chang, Chih-Ming Wang, Hue-Min Lin, Chia-Yun Cheng, Chun-Chia Chen, Min-Hao Chiu, Sheng-Jen Wang, P. Chao, Meng-Jye Hu, Fu-Chun Yeh, Shun-Hsiang Chuang, Hsiu-Yi Lin, Ming-Long Wu, Che-Hong Chen, Chia-Lin Ho, Chi-Cheng Ju
{"title":"Energy and area efficient hardware implementation of 4K Main-10 HEVC decoder in Ultra-HD Blu-ray player and TV systems","authors":"Tsu-Ming Liu, Yung-Chang Chang, Chih-Ming Wang, Hue-Min Lin, Chia-Yun Cheng, Chun-Chia Chen, Min-Hao Chiu, Sheng-Jen Wang, P. Chao, Meng-Jye Hu, Fu-Chun Yeh, Shun-Hsiang Chuang, Hsiu-Yi Lin, Ming-Long Wu, Che-Hong Chen, Chia-Lin Ho, Chi-Cheng Ju","doi":"10.1109/ICME.2015.7177399","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177399","url":null,"abstract":"A 4K and Main-10 HEVC video decoder LSI is fabricated in a 28nm CMOS process. It adopts a block-concealed processor (BcP) to improve the visual quality and a bandwidth-suppressed processor (BsP) is newly designed to reduce 30% and 45% of external data accesses in playback and gaming scenario, respectively. It features fully core scalable (FCS) architecture which lowers the required working frequency by 65%. A 10-bit compact scheme is proposed to reduce the frame buffer space by 37.5%. Moreover, a multi-standard architecture reduces are by 28%. It achieves 530Mpixels/s throughput which is two times larger than the state-of-the-art HEVC design [2] and consumes 0.2nJ/pixel energy efficiency, enabling real-time 4K video playback in Ultra-HD Blu-ray player and TV systems.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133050584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
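As a sanity check, the headline figures are mutually consistent under a simple interpretation. Reading the 10-bit compact scheme as packing 10-bit samples that would otherwise be padded to 16-bit words is our assumption, not something the abstract states.

```python
# Core power implied by the reported throughput and energy efficiency.
throughput = 530e6            # pixels per second
energy_per_pixel = 0.2e-9     # joules per pixel
print(throughput * energy_per_pixel)   # ~0.106 W at full decode rate

# 10-bit compact scheme: if 10-bit samples would otherwise be padded to
# 16-bit words, packing them saves (16 - 10) / 16 of the frame buffer.
print((16 - 10) / 16)                  # 0.375, matching the 37.5% figure
```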
Multi-graph multi-instance learning with soft label consistency for object-based image retrieval
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177391
Fei Li, Rujie Liu
{"title":"Multi-graph multi-instance learning with soft label consistency for object-based image retrieval","authors":"Fei Li, Rujie Liu","doi":"10.1109/ICME.2015.7177391","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177391","url":null,"abstract":"Object-based image retrieval has been an active research topic in the last decade, in which a user is only interested in some object instead of the whole image. As a promising approach, graph-based multi-instance learning has been paid much attention. Early retrieval methods often conduct learning on one graph in either image or region level. To further improve the performance, some recent methods adopt multi-graph learning, but the relationship between image- and region-level information is not well explored. In this paper, by constructing both image- and region-level graphs, a novel multi-graph multi-instance learning method is proposed. Different from the existing methods, the relationship between each labeled image and its segmented regions is reflected by the consistency of their corresponding soft labels, and it is formulated by the mutual restrictions in an optimization framework. A comprehensive cost function is designed to involve all the available information, and an iterative solution is introduced to solve the problem. Experimental results on the benchmark data set demonstrate the effectiveness of our proposal.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"285 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122973803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
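The mutual-restriction idea can be emulated with alternating label propagation on the two graphs plus a blending step that ties an image's soft label to those of its regions. The following is a hedged sketch of that general scheme, not the paper's actual cost function or solver.

```python
import numpy as np

def row_normalize(W):
    return W / (W.sum(axis=1, keepdims=True) + 1e-12)

def mgmi_soft_labels(W_img, W_reg, y_img, labelled, img_of_reg,
                     mu=0.8, lam=0.5, iters=100):
    """Alternating propagation on image- and region-level graphs with a
    soft-label consistency step (a simple stand-in for the paper's
    mutual restrictions).

    W_img : (Ni, Ni) image-level affinity; W_reg : (Nr, Nr) region-level
    y_img : (Ni,) labels (+1/-1 for labelled images, 0 otherwise)
    labelled : (Ni,) boolean mask of labelled images
    img_of_reg : (Nr,) index of the image each region belongs to
    """
    Pi, Pr = row_normalize(W_img), row_normalize(W_reg)
    f_img = y_img.astype(float).copy()
    f_reg = np.zeros(len(img_of_reg))
    for _ in range(iters):
        f_img = mu * Pi @ f_img + (1 - mu) * y_img              # image graph
        f_reg = mu * Pr @ f_reg + (1 - mu) * f_img[img_of_reg]  # region graph
        # Consistency: pull each image toward the mean of its regions.
        counts = np.bincount(img_of_reg, minlength=len(f_img))
        means = np.bincount(img_of_reg, weights=f_reg,
                            minlength=len(f_img)) / (counts + 1e-12)
        f_img = (1 - lam) * f_img + lam * means
        f_img[labelled] = y_img[labelled]       # keep given labels fixed
    return f_img, f_reg
```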
Instructive video retrieval for surgical skill coaching using attribute learning
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177389
Lin Chen, Qiang Zhang, Peng Zhang, Baoxin Li
{"title":"Instructive video retrieval for surgical skill coaching using attribute learning","authors":"Lin Chen, Qiang Zhang, Peng Zhang, Baoxin Li","doi":"10.1109/ICME.2015.7177389","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177389","url":null,"abstract":"Video-based coaching systems have seen increasing adoption in various applications including dance, sports, and surgery training. Most existing systems are either passive (for data capture only) or barely active (with limited automated feedback to a trainee). In this paper, we present a video-based skill coaching system for simulation-based surgical training by exploring a newly proposed problem of instructive video retrieval. By introducing attribute learning into video for high-level skill understanding, we aim at providing automated feedback and providing an instructive video, to which the trainees can refer for performance improvement. This is achieved by ensuring the feedback is weakness-specific, skill-superior and content-similar. A suite of techniques was integrated to build the coaching system with these features. In particular, algorithms were developed for action segmentation, video attribute learning, and attribute-based video retrieval. Experiments with realistic surgical videos demonstrate the feasibility of the proposed method and suggest areas for further improvement.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133160300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
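The three retrieval criteria named in the abstract suggest a simple scoring rule. The sketch below is one naive combination under assumed attribute and content descriptors, not the paper's algorithm.

```python
import numpy as np

def retrieve_instructive(trainee_attr, lib_attrs, trainee_desc, lib_descs):
    """Pick a library video for coaching under three assumed criteria.

    trainee_attr : (K,) skill-attribute scores of the trainee's performance
    lib_attrs    : (N, K) skill-attribute scores of the library videos
    trainee_desc : (D,) content descriptor of the trainee's video
    lib_descs    : (N, D) content descriptors of the library videos
    """
    weak = np.argmin(trainee_attr)                    # weakness-specific
    margin = lib_attrs[:, weak] - trainee_attr[weak]  # skill-superior margin
    sim = lib_descs @ trainee_desc / (                # content-similar
        np.linalg.norm(lib_descs, axis=1)
        * np.linalg.norm(trainee_desc) + 1e-12)
    score = margin + sim                              # naive combination
    score[margin <= 0] = -np.inf                      # must outperform trainee
    return int(np.argmax(score))
```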
Image retrieval based on compressed camera sensor fingerprints
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177454
D. Valsesia, G. Coluccia, T. Bianchi, E. Magli
{"title":"Image retrieval based on compressed camera sensor fingerprints","authors":"D. Valsesia, G. Coluccia, T. Bianchi, E. Magli","doi":"10.1109/ICME.2015.7177454","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177454","url":null,"abstract":"Image retrieval is the process of finding images from a large collection, satisfying a user-specified criterion. Content-based retrieval has been the traditional paradigm, in which one wishes to find images whose content is similar to a query. In this paper we explore a novel criterion for image search, based on forensic principles. We address the problem of retrieving all the photos in a collection that have been acquired by a specific device which is presented to the system as a query. This is an important forensic problem, whose solution could be very useful for detecting improper usage of pictures. We do not rely on metadata such as Exif headers because they can be unavailable, or easily manipulated, and in most cases cannot identify the specific device. We rely instead on a forensic tool called Photo Response Non-Uniformity (PRNU), which constitutes a reliable fingerprint of a camera sensor. We examine recent advances in compression of such fingerprints, which allow to address the previously unexplored image retrieval problem on large scales.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128269022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
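PRNU matching is classically a normalized-correlation test, and random projections are one standard way to compress fingerprints while approximately preserving correlations. The sketch below illustrates that pipeline on toy data; the paper's exact compression scheme may differ.

```python
import numpy as np

def compress_fingerprint(k, P):
    """Random-projection compression of a flattened PRNU fingerprint."""
    return P @ k.ravel()

def match_score(c_query, c_ref):
    """Normalized correlation between compressed fingerprints."""
    return float(c_query @ c_ref /
                 (np.linalg.norm(c_query) * np.linalg.norm(c_ref) + 1e-12))

rng = np.random.default_rng(1)
n, m = 64 * 64, 512                            # fingerprint and code lengths
P = rng.standard_normal((m, n)) / np.sqrt(m)   # shared projection matrix
cam = rng.standard_normal(n)                   # toy camera fingerprint
photo = cam + 2.0 * rng.standard_normal(n)     # noisy per-image estimate
other = rng.standard_normal(n)                 # unrelated camera

# High score for the matching camera, near zero for the unrelated one.
print(match_score(compress_fingerprint(photo, P), compress_fingerprint(cam, P)))
print(match_score(compress_fingerprint(other, P), compress_fingerprint(cam, P)))
```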
Probabilistic learning from mislabelled data for multimedia content recognition
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177393
Pravin Kakar, A. Chia
{"title":"Probabilistic learning from mislabelled data for multimedia content recognition","authors":"Pravin Kakar, A. Chia","doi":"10.1109/ICME.2015.7177393","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177393","url":null,"abstract":"There have been considerable advances in multimedia recognition recently as powerful computing capabilities and large, representative datasets become ubiquitous. A fundamental assumption of traditional recognition techniques is that the data available for training are accurately labelled. Given the scale and diversity of web data, it takes considerable annotation effort to reduce label noise to acceptable levels. In this work, we propose a novel method to work around this issue by utilizing approximate apriori estimates of the mislabelling probabilities to design a noise-aware learning framework. We demonstrate the proposed framework's effectiveness on several datasets of various modalities and show that it is able to achieve high levels of accuracy even when faced with significant mislabelling in the data.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126059344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
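One standard way to exploit a priori mislabelling probabilities is "forward" loss correction with a noise transition matrix; the paper's exact formulation is not reproduced here, but the sketch conveys the flavor.

```python
import numpy as np

def forward_corrected_nll(logits, noisy_label, T):
    """Noise-aware negative log-likelihood via forward correction.

    logits      : (C,) model scores over the clean classes
    noisy_label : observed (possibly wrong) label index
    T           : (C, C) noise matrix, T[i, j] = P(observed j | true i)
    """
    p_clean = np.exp(logits - logits.max())
    p_clean /= p_clean.sum()
    p_noisy = T.T @ p_clean      # predicted distribution over noisy labels
    return -np.log(p_noisy[noisy_label] + 1e-12)

# Illustrative prior: 10% symmetric label noise over 3 classes.
C, eps = 3, 0.1
T = np.full((C, C), eps / (C - 1))
np.fill_diagonal(T, 1 - eps)
print(forward_corrected_nll(np.array([2.0, 0.1, -1.0]), 0, T))
```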
Visualizing video sounds with sound word animation
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177422
Fangzhou Wang, H. Nagano, K. Kashino, T. Igarashi
{"title":"Visualizing video sounds with sound word animation","authors":"Fangzhou Wang, H. Nagano, K. Kashino, T. Igarashi","doi":"10.1109/ICME.2015.7177422","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177422","url":null,"abstract":"Text captions are important means to provide sound information in videos when the sound is not accessible. However, conventional text captions are far less expressive for non-verbal sounds since they are designed to visualize speech sound. To address this problem, we propose a method for automatically transforming non-verbal video sounds to animated sound words, and positioning them near the sound source objects in the video for visualization. This provides natural visual representation of non-verbal sounds with rich information about the sound category and dynamics. We conducted a user study with over 300 participants using an online crowdsourcing service. The results showed that animated sound words could not only effectively and naturally visualize the dynamics of sound while clarify the position of the sound source, but also contribute to making video watching more enjoyable and increasing the visual impact of the video.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117039964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
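One component of such a system, mapping a sound's dynamics to the animation of its word, can be sketched as a loudness-to-font-size curve. This is our illustration of the general idea, not the authors' implementation.

```python
import numpy as np

def sound_word_sizes(samples, sr, fps=30, min_pt=12, max_pt=48):
    """Scale a sound word's font size with short-term loudness.

    samples : mono audio signal sampled at `sr` Hz
    Returns one font size (points) per video frame at `fps`.
    """
    hop = sr // fps
    n_frames = len(samples) // hop
    rms = np.array([np.sqrt(np.mean(samples[i * hop:(i + 1) * hop] ** 2))
                    for i in range(n_frames)])
    norm = rms / (rms.max() + 1e-12)          # 0..1 loudness envelope
    return min_pt + (max_pt - min_pt) * norm  # per-frame font size

sr = 16000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)  # decaying "ding"
print(sound_word_sizes(clip, sr)[:5])                # large, then shrinking
```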
Harmonic Change Detection for musical chords segmentation
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177404
Alessio Degani, M. Dalai, R. Leonardi, P. Migliorati
{"title":"Harmonic Change Detection for musical chords segmentation","authors":"Alessio Degani, M. Dalai, R. Leonardi, P. Migliorati","doi":"10.1109/ICME.2015.7177404","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177404","url":null,"abstract":"In this paper, different strategies for the calculation of the Harte's Harmonic Change Detection Function (HCDF) are discussed. HCDFs can be used for detecting chord boundaries for Automatic Chord Estimation (ACE) tasks, where the chord transitions are identified as peaks in the HCDF. We show that different audio features and different novelty metric have significant impact on the overall accuracy results of a chord segmentation algorithm. Furthermore, we show that certain combination of audio features and novelty measures provide a significant improvement with respect to the current chord segmentation algorithms.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132644180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
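Harte's original HCDF maps chroma to a 6-D tonal centroid space and takes the distance between consecutive frames; the paper varies the features and the novelty metric around this baseline. A sketch of the baseline follows (the smoothing of the centroid sequence used in the original formulation is omitted for brevity).

```python
import numpy as np

def tonal_centroid_matrix():
    """6x12 transform from chroma to Harte's tonal centroid space
    (circles of fifths, minor thirds, and major thirds)."""
    l = np.arange(12)
    r1, r2, r3 = 1.0, 1.0, 0.5
    return np.vstack([
        r1 * np.sin(l * 7 * np.pi / 6), r1 * np.cos(l * 7 * np.pi / 6),
        r2 * np.sin(l * 3 * np.pi / 2), r2 * np.cos(l * 3 * np.pi / 2),
        r3 * np.sin(l * 2 * np.pi / 3), r3 * np.cos(l * 2 * np.pi / 3),
    ])

def hcdf(chroma):
    """HCDF over a (12, T) chromagram: Euclidean distance between
    consecutive tonal centroids; peaks suggest chord boundaries."""
    Phi = tonal_centroid_matrix()
    c = chroma / (np.abs(chroma).sum(axis=0, keepdims=True) + 1e-12)
    tc = Phi @ c                                        # (6, T)
    return np.linalg.norm(np.diff(tc, axis=1), axis=0)  # (T-1,)

# Toy chromagram: four C major frames followed by four F major frames.
Cmaj = np.zeros(12); Cmaj[[0, 4, 7]] = 1
Fmaj = np.zeros(12); Fmaj[[5, 9, 0]] = 1
chroma = np.stack([Cmaj] * 4 + [Fmaj] * 4, axis=1)
print(hcdf(chroma))   # near zero within chords, a peak at the transition
```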
Robust interactive image segmentation with weak supervision for mobile touch screen devices
2015 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177395
T. Wang, Huiling Wang, Lixin Fan
{"title":"Robust interactive image segmentation with weak supervision for mobile touch screen devices","authors":"T. Wang, Huiling Wang, Lixin Fan","doi":"10.1109/ICME.2015.7177395","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177395","url":null,"abstract":"In this paper, we present a robust and efficient approach for segmenting images with less and intuitive user interaction, particularly targeted for mobile touch screen devices. Our approach combines geodesic distance information with the flexibility of level set methods in energy minimization, leveraging the complementary strengths of each to promote accurate boundary placement and strong region connectivity while requiring less user interaction. To maximize the user-provided prior knowledge, we further propose a weakly supervised seed generation algorithm which enables image object segmentation without user-provided background seeds. Our approach provides a practical solution for visual object cutout on mobile touch screen devices, facilitating various media manipulation applications. We describe such a use case to selectively create oil painting effects on images. We demonstrate that our approach is less sensitive to seed placement and better at edge localization, whilst requiring less user interaction, compared with the state-of-the-art methods.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131593468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
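The geodesic-distance component can be sketched as Dijkstra's algorithm on the pixel grid with intensity-difference edge costs, so distances stay small inside a homogeneous region and jump across strong edges. The level-set energy and the seed generation are beyond this illustration, which is ours and not the paper's code.

```python
import heapq
import numpy as np

def geodesic_distance(img, seeds):
    """Geodesic distance from seed pixels over a grayscale image.

    img   : (H, W) float array of intensities
    seeds : list of (row, col) foreground seed pixels
    """
    H, W = img.shape
    dist = np.full((H, W), np.inf)
    heap = [(0.0, r, c) for r, c in seeds]
    for _, r, c in heap:
        dist[r, c] = 0.0
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < H and 0 <= nc < W:
                # Path cost accumulates intensity differences.
                nd = d + abs(img[nr, nc] - img[r, c])
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist
```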