2017 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

3D action recognition using data visualization and convolutional neural networks
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019438
Mengyuan Liu, Chen Chen, Hong Liu
Abstract: It remains a challenge to efficiently represent spatial-temporal data for 3D action recognition. To solve this problem, this paper presents a new skeleton-based action representation using data visualization and convolutional neural networks, which contains four main stages. First, skeletons from an action sequence are mapped to a set of five-dimensional points, containing three dimensions of location, one dimension of time label and one dimension of joint label. Second, these points are encoded as a series of color images by visualizing the points as RGB pixels. Third, convolutional neural networks are adopted to extract deep features from the color images. Finally, the action class score is calculated by fusing selected deep features. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art results.
Citations: 16
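
The listing does not spell out the exact point-to-pixel mapping, but the visualization step can be illustrated with a common instantiation: put joints on one image axis and frames on the other, and color each pixel by the joint's normalized (x, y, z) location. A minimal sketch, assuming a (frames, joints, 3) input array:

```python
import numpy as np

def skeletons_to_color_image(seq):
    """Encode a skeleton sequence as an RGB image.

    seq: array of shape (T, J, 3) with (x, y, z) joint locations over
    T frames and J joints. Each joint/frame pair is a 5D point
    (x, y, z, t, j); here t and j become the pixel coordinates and the
    normalized (x, y, z) become the RGB value. This is one plausible
    instantiation of the visualization step, not the authors' exact
    mapping.
    """
    lo = seq.min(axis=(0, 1), keepdims=True)  # per-axis minimum
    hi = seq.max(axis=(0, 1), keepdims=True)  # per-axis maximum
    rgb = (seq - lo) / (hi - lo + 1e-8)       # normalize to [0, 1]
    # rows index joints (label j), columns index time (label t)
    return (np.transpose(rgb, (1, 0, 2)) * 255).astype(np.uint8)

# Toy usage: 30 frames, 20 joints, random joint positions.
image = skeletons_to_color_image(np.random.rand(30, 20, 3))
print(image.shape)  # (20, 30, 3): one color image per sequence, fed to a CNN
```
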
Hybrid color attribute compression for point cloud data
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019426
Li Cui, Haiyan Xu, E. Jang
Abstract: This paper proposes a color attribute compression method for MPEG Point Cloud Compression (PCC) that exploits the spatial redundancy among adjacent points. With the increased interest in representing real-world surfaces as 3D point clouds, compressing the attributes of a point cloud (i.e., colors and normal directions) has attracted great attention in MPEG. The proposed method groups adjacent points into blocks, and two encoding modes are supported for each block: a run-length encoding mode and a palette mode. The final encoding mode for each block is determined by comparing the distortion values of the two modes. Experimental results show that the proposed approach achieves about a 28 percent better compression ratio than MPEG PCC.
Citations: 10
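
The per-block mode decision can be sketched as follows. This toy version picks between run-length and palette coding using a crude entry-count size proxy rather than the paper's distortion comparison, and the cost constants are assumptions:

```python
import numpy as np

def run_length_encode(colors):
    """Run-length encode a block of colors as [(color, run_len), ...]."""
    runs = []
    for c in colors:
        if runs and np.array_equal(runs[-1][0], c):
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((c, 1))
    return runs

def palette_encode(colors):
    """Palette encode a block as (palette, per-point palette indices)."""
    palette, indices = np.unique(colors, axis=0, return_inverse=True)
    return palette, indices

def choose_mode(colors):
    """Pick the cheaper mode per block by a crude cost proxy
    (entry counts); the paper compares per-mode distortion instead."""
    rle = run_length_encode(colors)
    palette, indices = palette_encode(colors)
    rle_cost = 4 * len(rle)                     # color triple + run length
    pal_cost = 3 * len(palette) + len(indices)  # palette entries + indices
    if rle_cost <= pal_cost:
        return "run-length", rle
    return "palette", (palette, indices)

# Toy block of 8 adjacent points' RGB colors.
block = np.array([[10, 10, 10]] * 5 + [[200, 0, 0]] * 3, dtype=np.uint8)
mode, payload = choose_mode(block)
print(mode)  # "run-length" wins for this highly repetitive block
```
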
Strummer: An interactive guitar chord practice system
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019338
Shunya Ariga, Masataka Goto, K. Yatani
Abstract: Playing a musical instrument is a skill many people desire to acquire, and learners now have a wide variety of learning materials. However, the volume of these materials is enormous, and novice learners can easily get lost in deciding which songs to practice first. We develop Strummer, an interactive multimedia system for guitar practice. Strummer provides data-driven, personalized practice for learners by identifying important and easy-to-learn chords and songs. This practice design is intended to encourage smooth skill transfer to songs that learners have not even seen. Our user study confirms the benefits and possible improvements of the Strummer system. In particular, participants expressed positive impressions of the lessons provided by the system.
Citations: 6
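
The "important and easy-to-learn" chord selection can be illustrated with a toy ranking: count how often each chord appears in a hypothetical song corpus and discount by a hand-assigned difficulty. Both the corpus and the difficulty scores below are stand-ins, not Strummer's actual statistics:

```python
from collections import Counter

# Hypothetical song corpus (lists of chord names) and fingering
# difficulty scores; neither comes from the paper.
songs = [
    ["C", "G", "Am", "F"],
    ["G", "D", "Em", "C"],
    ["C", "F", "G", "C"],
]
difficulty = {"C": 1, "G": 1, "D": 2, "Am": 1, "Em": 1, "F": 3}

def rank_chords(songs, difficulty):
    """Order chords by corpus coverage per unit difficulty, a stand-in
    for Strummer's notion of 'important and easy-to-learn'."""
    counts = Counter(chord for song in songs for chord in song)
    return sorted(counts, key=lambda c: counts[c] / difficulty[c], reverse=True)

print(rank_chords(songs, difficulty))  # e.g. ['C', 'G', 'Am', ...]
```
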
End-to-end learning for dimensional emotion recognition from physiological signals
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019533
Gil Keren, Tobias Kirschstein, E. Marchi, F. Ringeval, Björn Schuller
Abstract: Dimensional emotion recognition from physiological signals is a highly challenging task. Common methods rely on hand-crafted features that do not yet provide the performance necessary for real-life applications. In this work, we exploit a series of convolutional and recurrent neural networks to predict affect from physiological signals, such as electrocardiogram and electrodermal activity, directly from the raw time representation. The motivation behind this so-called end-to-end approach is that, ultimately, the network learns an intermediate representation of the physiological signals that better suits the task at hand. Experimental evaluations show that this very first study on end-to-end learning of emotion from physiology yields significantly better performance than existing work on the challenging RECOLA database, which includes fully spontaneous affective behaviors displayed during naturalistic interactions. Furthermore, we gain a better understanding of the models' inner representations by demonstrating that the activations of some cells in the convolutional network are correlated to a large extent with hand-crafted features.
Citations: 38
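
A minimal PyTorch sketch of the end-to-end idea: stacked 1-D convolutions over the raw signals followed by a recurrent layer and a per-step regression head for arousal and valence. Layer sizes here are arbitrary assumptions, not the authors' architecture:

```python
import torch
import torch.nn as nn

class EndToEndAffect(nn.Module):
    """Conv + recurrent regressor from raw 1-D physiology to continuous
    arousal/valence; a sketch of the end-to-end approach, not the
    paper's exact network."""
    def __init__(self, in_channels=2, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=9, stride=2, padding=4),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=9, stride=2, padding=4),
            nn.ReLU(),
        )
        self.rnn = nn.GRU(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # arousal and valence

    def forward(self, x):                    # x: (batch, channels, time)
        h = self.conv(x)                     # learned intermediate representation
        h, _ = self.rnn(h.transpose(1, 2))   # (batch, time', hidden)
        return self.head(h)                  # per-step dimensional predictions

# Toy batch: 4 recordings, 2 raw signals (ECG + EDA), 1024 samples each.
model = EndToEndAffect()
out = model(torch.randn(4, 2, 1024))
print(out.shape)  # torch.Size([4, 256, 2])
```
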
Learning informative pairwise joints with energy-based temporal pyramid for 3D action recognition
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019313
Mengyuan Liu, Chen Chen, Hong Liu
Abstract: This paper presents an effective local spatial-temporal descriptor for action recognition from skeleton sequences. The unique property of our descriptor is that it takes spatial-temporal discrimination and action speed variations into account, aiming to both distinguish similar actions and identify actions performed at different speeds within a single framework. The algorithm consists of two stages. First, a frame selection method removes noisy skeletons from a given skeleton sequence. From the selected skeletons, skeleton joints are mapped to a high-dimensional space, where each point refers to the kinematics, time label and joint label of a skeleton joint. To encode relative relationships among joints, pairwise points from this space are then jointly mapped to a new space, where each point encodes the relative relationships of skeleton joints. Second, a Fisher Vector (FV) is employed to encode all points from the new space as a compact feature representation. To cope with speed variations in actions, an energy-based temporal pyramid is applied to form a multi-temporal FV representation, which is fed into a kernel-based extreme learning machine classifier for recognition. Extensive experiments on benchmark datasets consistently show that our method outperforms state-of-the-art approaches for skeleton-based action recognition.
Citations: 7
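
The energy-based temporal pyramid can be illustrated by partitioning a sequence at equal shares of cumulative motion energy rather than at equal frame counts, so fast portions occupy fewer frames per part. A sketch under that reading; the exact formulation is not given in this listing:

```python
import numpy as np

def energy_based_split(seq, num_parts=2):
    """Partition a skeleton sequence (T, J, 3) so each part holds
    roughly equal motion energy, where per-frame energy is the summed
    joint displacement. One plausible reading of the energy-based
    temporal pyramid, not the authors' exact formulation."""
    energy = np.linalg.norm(np.diff(seq, axis=0), axis=2).sum(axis=1)
    cum = np.cumsum(energy) / (energy.sum() + 1e-8)
    # frame indices where cumulative energy crosses k / num_parts
    cuts = [int(np.searchsorted(cum, k / num_parts)) + 1
            for k in range(1, num_parts)]
    return np.split(seq, cuts, axis=0)

# Toy sequence: slow first half, fast second half.
slow = np.cumsum(np.random.randn(20, 15, 3) * 0.01, axis=0)
fast = slow[-1] + np.cumsum(np.random.randn(20, 15, 3) * 0.2, axis=0)
parts = energy_based_split(np.concatenate([slow, fast]), num_parts=2)
print([p.shape[0] for p in parts])  # the fast part spans far fewer frames
```
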
DenseTracker: A multi-task dense network for visual tracking
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019506
Fei Zhao, Ming Tang, Yi Wu, Jinqiao Wang
Abstract: Tracking an arbitrary object in video is one of the main challenges in computer vision and has been studied for decades. Based on hand-crafted features, traditional trackers show poor discriminability for complex changes of object appearance. Recently, some trackers based on convolutional neural networks (CNNs) have shown promising results by exploiting rich convolutional features. In this paper, we propose a novel DenseTracker based on a multi-task dense convolutional network. To learn a more compact and discriminative representation, we adopt a dense block structure to combine features from different layers. A multi-task loss is then designed to accurately predict the object position and scale by jointly learning box regression and pairwise similarity. Furthermore, the DenseTracker is trained end-to-end on large-scale datasets, including ImageNet Video (VID) and ALOV300++. The DenseTracker runs at 25 fps on a GPU and achieves state-of-the-art performance on the two public benchmarks OTB50 and VOT2016.
Citations: 4
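
The multi-task objective can be sketched as a weighted sum of a box-regression loss and a pairwise-similarity loss. The similarity form, logit scaling, and weighting below are assumptions, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def multitask_tracking_loss(pred_boxes, gt_boxes, emb_a, emb_b, same, w=1.0):
    """Joint loss sketch in the spirit of DenseTracker's multi-task
    objective: smooth-L1 box regression plus a pairwise-similarity
    term (binary cross-entropy on scaled cosine similarity).

    pred_boxes, gt_boxes: (N, 4) box parameters
    emb_a, emb_b:         (M, D) embeddings of image-patch pairs
    same:                 (M,) 1.0 if a pair shows the same object
    """
    reg = F.smooth_l1_loss(pred_boxes, gt_boxes)
    sim = F.cosine_similarity(emb_a, emb_b)          # in [-1, 1]
    # scale similarity into a logit range; the factor 5.0 is arbitrary
    pair = F.binary_cross_entropy_with_logits(sim * 5.0, same)
    return reg + w * pair

# Toy check with random tensors.
loss = multitask_tracking_loss(torch.randn(8, 4), torch.randn(8, 4),
                               torch.randn(8, 128), torch.randn(8, 128),
                               torch.randint(0, 2, (8,)).float())
print(loss.item())
```
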
Image quality assessment for free viewpoint video based on mid-level contours feature
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019431
Suiyi Ling, P. Callet
Abstract: Free viewpoint video (FVV), which offers users an immersive experience with multiple views, is one of the new trends in advanced visual media. The new viewpoints are traditionally synthesized via depth image-based rendering (DIBR), and geometric distortions are therefore observed. Mid-level contour descriptors are capable of evaluating the edge incoherence in synthesized images that common image quality metrics fail to capture. In this paper, we use the concept of the 'Sketch Token', a mid-level contour descriptor, and introduce a novel metric for DIBR-synthesized image quality assessment that measures how classes of contours change after synthesis. Experiments are conducted on the IRCCyN/IVC DIBR image database, and the results show that the proposed metric achieves a correlation of 88.77%, which is comparable to state-of-the-art metrics such as MW-PSNR and MP-PSNR.
Citations: 19
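
Measuring "how classes of contours change" can be illustrated by comparing the distributions of contour-class labels between the reference and the synthesized view. The Sketch Token classifier itself is assumed to be available and is not reimplemented; the chi-square comparison is a sketch of the idea, not the paper's metric:

```python
import numpy as np

def contour_class_histogram(class_map, num_classes):
    """Normalized histogram of per-pixel contour-class labels (the
    output of a mid-level descriptor such as Sketch Token)."""
    hist = np.bincount(class_map.ravel(), minlength=num_classes).astype(float)
    return hist / (hist.sum() + 1e-8)

def contour_change_score(ref_classes, syn_classes, num_classes=151):
    """Chi-square distance between contour-class distributions of the
    reference and the DIBR-synthesized view; larger means the contour
    classes changed more after synthesis."""
    h1 = contour_class_histogram(ref_classes, num_classes)
    h2 = contour_class_histogram(syn_classes, num_classes)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + 1e-8))

# Toy label maps; 151 classes assumes 150 contour classes plus background.
ref = np.random.randint(0, 151, size=(64, 64))
syn = np.random.randint(0, 151, size=(64, 64))
print(contour_change_score(ref, syn))
```
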
Semantic-aware adaptation scheme for soccer video over MPEG-DASH
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019541
Shenghong Hu, Lingfen Sun, Chunxia Xiao, Chao Gui
Abstract: In recent years, quality of experience (QoE) has been shown to depend on both visual quality and perceptual quality, where perceptual quality means that content personalized to the user should be delivered at optimized quality. That is to say, segments holding content the user is interested in, such as highlights, need to be allocated more network resources in a resource-limited streaming scenario. However, existing HTTP-based adaptive methods focus only on content-agnostic bitrate adaptation according to limited network or energy resources; they ignore the semantics users perceive in important segments, so these segments receive no better quality than ordinary ones, which hurts the overall QoE. In this paper, we propose a new semantic-aware adaptation scheme for MPEG-DASH services, which decides how to allocate bandwidth and buffering time based on content descriptors for the content users perceive as important. Furthermore, a semantic-aware probe and adaptation (SMA-PANDA) algorithm has been implemented in a DASH client and compared with conventional bitrate adaptation. Preliminary results show that SMA-PANDA achieves better QoE and flexibility in streaming the user's content of interest on an MPEG-DASH platform, and it helps such content compete for more resources to deliver a high-quality presentation.
Citations: 6
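
The core allocation idea, giving semantically important segments more bitrate under a bandwidth budget, can be sketched with a greedy upgrade rule. The bitrate ladder, weights, and rule below are illustrative assumptions, not the SMA-PANDA algorithm:

```python
def pick_representations(weights, budget, ladder=(0.5, 1.0, 2.5, 5.0)):
    """Greedy semantic-aware bitrate allocation sketch: start every
    upcoming segment at the lowest representation, then repeatedly
    upgrade the segment with the highest semantic weight that still
    fits in the bandwidth budget (Mbit/s per segment slot)."""
    levels = [0] * len(weights)
    spent = ladder[0] * len(weights)   # everyone starts at the lowest rung
    while True:
        # try one upgrade per pass, most semantically important first
        order = sorted(range(len(weights)),
                       key=lambda i: weights[i], reverse=True)
        for i in order:
            if levels[i] + 1 < len(ladder):
                extra = ladder[levels[i] + 1] - ladder[levels[i]]
                if spent + extra <= budget:
                    levels[i] += 1
                    spent += extra
                    break              # re-scan from the top after an upgrade
        else:
            return levels              # no affordable upgrade is left

# Five upcoming segments; the middle one is a highlight (weight 5).
print(pick_representations([1, 1, 5, 1, 1], budget=8.0))  # -> [1, 1, 3, 0, 0]
```
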
Multiscale dictionary learning for hierarchical sparse representation
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019435
Yangmei Shen, H. Xiong, Wenrui Dai
Abstract: In this paper, we propose a multiscale dictionary learning framework for hierarchical sparse representation of natural images. The proposed framework leverages an adaptive quadtree decomposition to represent structured sparsity at different scales. In dictionary learning, a tree-structured regularized optimization is formulated to distinguish and represent high-frequency details based on varying local statistics, and to group low-frequency components for local smoothness and structural consistency. In comparison to the traditional proximal gradient method, block-coordinate descent is adopted to improve the efficiency of dictionary learning with a guarantee of recovery performance. The proposed framework enables hierarchical sparse representation by naturally organizing the trained dictionary atoms in a prespecified arborescent structure with descending scales from root to leaves. Consequently, the approximation of high-frequency details can be improved with progressive refinement from coarser to finer scales. When employed for image denoising, the proposed framework is demonstrated to be competitive with state-of-the-art methods in terms of objective and visual restoration quality.
Citations: 2
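
The adaptive quadtree decomposition can be sketched by recursively splitting blocks whose local variance is high, so smooth regions stay coarse and detailed regions are refined. The variance threshold and minimum block size are arbitrary assumptions:

```python
import numpy as np

def adaptive_quadtree(img, y=0, x=0, size=None, var_thresh=0.01, min_size=8):
    """Recursively split a square image into blocks, keeping smooth
    (low-variance) blocks coarse and splitting detailed ones; a sketch
    of the adaptive quadtree decomposition that drives the multiscale
    dictionary. Returns (y, x, size) leaf blocks."""
    if size is None:
        size = img.shape[0]
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.var() < var_thresh:
        return [(y, x, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += adaptive_quadtree(img, y + dy, x + dx, half,
                                        var_thresh, min_size)
    return leaves

# Toy image: smooth gradient with a noisy high-frequency patch.
img = np.tile(np.linspace(0, 1, 64), (64, 1))
img[40:56, 40:56] += np.random.randn(16, 16) * 0.5
print(len(adaptive_quadtree(img)))  # more, smaller leaves near the noisy patch
```
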
An interactive system for low-poly illustration generation from images using adaptive thinning
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-01 | DOI: 10.1109/ICME.2017.8019369
Yiting Ma, X. Chen, Yueling Bai
Abstract: Low-poly style illustrations, which have an abstract 3D appearance, have recently become a popular style. Most previous methods require specialized knowledge of 3D modeling and tedious interaction. We present an interactive system that lets non-expert users easily create low-poly style illustrations. Our system consists of two parts: vertex sampling and mesh rendering. In the vertex sampling stage, we extract a set of candidate points from the image and rank them by their importance for structure preservation using adaptive thinning. Based on the pre-ranked point list, the user can select an arbitrary number of vertices for triangle mesh construction. In the mesh rendering stage, we optimize triangle colors to create stereo-looking low-polys. We also provide three tools for flexible modification of vertex numbers, color contrast, and local region emphasis. The experimental results demonstrate that our system outperforms the state-of-the-art method via simple user interactions.
Citations: 2
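
The geometric core of such a pipeline can be sketched with random vertex sampling, Delaunay triangulation, and centroid-sampled triangle colors; both choices are crude stand-ins for the paper's adaptive-thinning vertex ranking and per-triangle color optimization:

```python
import numpy as np
from scipy.spatial import Delaunay

def low_poly(img, num_points=200, seed=0):
    """Build a low-poly rendering: sample vertices, Delaunay-triangulate,
    and fill each triangle with the color at its centroid."""
    h, w = img.shape[:2]
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0, [w - 1, h - 1], size=(num_points, 2))
    corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]])
    pts = np.vstack([pts, corners])            # keep the image border covered
    tri = Delaunay(pts)
    centroids = pts[tri.simplices].mean(axis=1)  # (n_tri, 2) as (x, y)
    colors = img[centroids[:, 1].astype(int), centroids[:, 0].astype(int)]
    return pts, tri.simplices, colors          # enough to rasterize or export

img = (np.random.rand(128, 128, 3) * 255).astype(np.uint8)
pts, tris, cols = low_poly(img)
print(len(tris), "triangles")
```
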