{"title":"Color Image Noise Covariance Estimation with Cross-Channel Image Noise Modeling","authors":"Li Dong, Jiantao Zhou, Tao Dai","doi":"10.1109/ICME.2018.8486558","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486558","url":null,"abstract":"Noise estimation is crucial in many image processing tasks such as denoising. Most of the existing noise estimation methods are specially developed for grayscale images. For color images, these methods simply handle each color channel independently, without considering the correlation across channels. In this work, we propose a multivariate Gaussian approach to model the noise in color images, in which we explicitly consider the inter-dependence among color channels. We design a practical method for estimating the noise covariance matrix within the proposed model. Specifically, a patch selection scheme is first introduced to select weakly textured patches through thresholding the texture strength indicators. Noticing that the patch selection actually depends on the unknown noise covariance, we present an iterative noise covariance estimation algorithm, where the patch selection and the covariance estimation are conducted alternately. Experimental results show that our method can effectively estimate the noise covariance. The practical usage is demonstrated with color image denoising.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125340292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mural2Sketch: A Combined Line Drawing Generation Method for Ancient Mural Painting","authors":"Di Sun, Jiawan Zhang, Gang Pan, Rui Zhan","doi":"10.1109/ICME.2018.8486504","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486504","url":null,"abstract":"Line drawing is a unique drawing technique developed over millennia in China. Since ancient murals contain line drawings of beautiful form and a long history, it is incredibly important to digitally curate these pieces. In this paper, we propose a line drawing generation method named Mural2Sketch (MS) for ancient mural paintings. MS first utilizes heuristic routing to detect the outer edge of a stroke, and then uses high-frequency enhancement filtering to extract the information inside the stroke. A complete stroke is then generated by collaborative representation. MS is capable of outputting the result in vector form and producing different artistic styles. Experimental results show that our method is simple but effective. This research has the potential to support digital mural copying, mural protection, and related cultural research and applications.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127914825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seethevoice: Learning from Music to Visual Storytelling of Shots","authors":"Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Yi-Hsuan Yang, H. Wang, Hsiao-Rong Tyan, H. Liao","doi":"10.1109/ICME.2018.8486496","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486496","url":null,"abstract":"Types of shots in the language of film are considered the key elements used by a director for visual storytelling. In filming a musical performance, manipulating shots can stimulate desired effects such as manifesting the emotion or deepening the atmosphere. However, while the visual storytelling technique is often employed in creating professional recordings of a live concert, audience recordings of the same event often lack such sophisticated manipulations. Thus it would be useful to have a versatile system that can perform video mashup to create a refined video from such amateur clips. To this end, we propose to translate the music into a near-professional shot (type) sequence by learning the relation between music and the visual storytelling of shots. The resulting shot sequence can then be used to better portray the visual storytelling of a song and guide the concert video mashup process. Our method introduces a novel probabilistic fusion approach, named multi-resolution fused recurrent neural networks (MF-RNNs) with film-language, which integrates multi-resolution fused RNNs and a film-language model for boosting the translation performance. The results from objective and subjective experiments demonstrate that MF-RNNs with film-language can generate an appealing shot sequence with a better viewing experience.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127767572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study on Multimodal Video Hyperlinking with Visual Aggregation","authors":"Mateusz Budnik, Mikail Demirdelen, G. Gravier","doi":"10.1109/icme.2018.8486549","DOIUrl":"https://doi.org/10.1109/icme.2018.8486549","url":null,"abstract":"Video hyperlinking offers a way to explore a video collection, making use of links that connect segments having related content. Hyperlinking systems thus seek to automatically create links by connecting given anchor segments to relevant targets within the collection. In this paper, we further investigate multimodal representations of video segments in a hyperlinking system based on bidirectional deep neural networks, which achieved state-of-the-art results in the TRECVid 2016 evaluation. A systematic study of different input representations is conducted with a focus on the aggregation of the representations of multiple keyframes. This includes, in particular, the use of memory vectors as a novel aggregation technique, which provides a significant improvement over other aggregation methods on the final hyperlinking task. Additionally, the use of metadata is investigated, leading to increased performance and lower computational requirements for the system.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121653181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge Detection and Image Segmentation on Encrypted Image with Homomorphic Encryption and Garbled Circuit","authors":"Delin Chen, Wenhao Chen, Jian Chen, Peijia Zheng, Jiwu Huang","doi":"10.1109/ICME.2018.8486551","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486551","url":null,"abstract":"Edge detection is one of the most important topics of image processing. In the scenario of cloud computing, performing edge detection may also require privacy protection. In this paper, we propose an edge detection and image segmentation scheme on an encrypted image with the Sobel edge detector. We implement Gaussian filtering and the Sobel operator on the image in the encrypted domain by exploiting the homomorphic property. By implementing an adaptive threshold decision algorithm in the encrypted domain, we obtain a threshold determined by the image distribution. With the technique of garbled circuits, we perform comparisons in the encrypted domain and obtain the edges of the image without decrypting the image in advance. We then propose an image segmentation scheme on the encrypted image based on the detected edges. Our experiments demonstrate the viability and effectiveness of the proposed encrypted image edge detection and segmentation.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126680666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual Learning for Visual Question Generation","authors":"Xing Xu, Jingkuan Song, Huimin Lu, Li He, Yang Yang, Fumin Shen","doi":"10.1109/ICME.2018.8486475","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486475","url":null,"abstract":"Recently, automatic answering of visually related questions (VQA) has gained a lot of attention in the computer vision community. However, there is little work on automatically generating questions for images (VQG). In fact, VQG closes the loop with question answering and yields diverse questions, which is useful to research on VQA. Motivated by the assumption that learning to answer questions may boost question generation, in this paper we introduce the VQA task as the complement of our primary VQG task, and propose a novel model that uses a dual learning framework to jointly learn the dual tasks. In the framework, we devise agents for VQG and VQA with pre-trained models respectively, and the learning tasks of the two agents form a closed loop, whose objectives are optimized together to guide each other via a reinforcement learning process. Specific rewards for each task are designed to update the models of the agents with the policy gradient method. The relation of these two tasks can be exploited to further improve the performance of the primary VQG task. Extensive experiments conducted on two large-scale datasets show that the proposed method is capable of generating grounded visual questions of sufficient coverage and outperforms previous VQG methods on standard measures.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122288454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistency-Exclusivity Regularized Deep Metric Learning for General Kinship Verification","authors":"Xiuzhuang Zhou, Zheng Zhang, Zeqiang Wei, Kai Jin, Min Xu","doi":"10.1109/ICME.2018.8486590","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486590","url":null,"abstract":"While encouraging results have been achieved so far in advancing kinship verification using facial images, learning a robust genetic similarity measure remains challenging, especially in the setting of general kinship verification, wherein the gender labels of the test samples are unknown in advance. In this paper we present a deep metric learning method with a carefully designed two-stream neural network to jointly learn a pair of deep embeddings for parent-child images. In particular, the deep embeddings are first modeled to explicitly consist of common and individual components, and then two additional constraints are introduced in deep metric learning: 1) value-aware consistency on the common components, and 2) position-aware exclusivity on the individual components. The proposed hierarchical consistency-exclusivity regularization enables our deep metric learning to harness the sharable and complementary patterns inherent in parent-child images. Empirically, we show improved performance over state-of-the-art metric learning solutions to general kinship verification on two benchmarks.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125642663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse Representation for Color Image Based on Geometric Algebra","authors":"Rui Wang, Yujie Wu, Miaomiao Shen, W. Cao","doi":"10.1109/ICME.2018.8486524","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486524","url":null,"abstract":"Existing sparse representation models represent RGB channels separately without considering the relationships among color channels, which inevitably loses some color structures. In this paper, we introduce a novel sparse representation model for color images based on geometric algebra (GA) theory, and propose its corresponding dictionary learning algorithm, K-GASVD. The model represents a color image as a multivector with spatial and spectral information in GA space, providing a vectorial representation of the inherent color structures rather than the scalar representation of current sparse image models. The proposed sparse model is validated in the applications of color image denoising and reconstruction. The experimental results demonstrate that our sparse image model successfully avoids the hue bias phenomenon and completely retains the color structures. It shows its potential as a general and powerful tool in various applications of color image analysis.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132462164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CUB360: Exploiting Cross-Users Behaviors for Viewport Prediction in 360 Video Adaptive Streaming","authors":"Yixuan Ban, Lan Xie, Zhimin Xu, Xinggong Zhang, Zongming Guo, Yue Wang","doi":"10.1109/ICME.2018.8486606","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486606","url":null,"abstract":"To ensure continuous playback of 360-degree video and reduce bandwidth waste, predicting the user's future fixation is indispensable. However, existing methods concentrate on either the user's motion information or content information. None of them considers the inconsistency of users' watching behaviors, which embodies the user's attention distribution more explicitly. So in this paper, we exploit cross-users behaviors for viewport prediction in 360-degree video adaptive streaming, namely CUB360, concurrently considering the user's personalized information and cross-users behavior information to predict the future viewport. Besides, we use a QoE-driven framework to optimize existing video streaming approaches and propose a general algorithm aiming at solving the NP problem at low complexity. Extensive experimental results over real datasets demonstrate that, compared with traditional adaptive streaming methods, our proposal can significantly boost the prediction accuracy by 20.2% absolutely and 48.1% relatively. Besides, the mean quality gains 30.28% while the quality variance is reduced by 29.89%.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132407404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Image Decoding via Edge-Preserving Generative Adversarial Networks","authors":"Qi Mao, Shiqi Wang, Shanshe Wang, Xinfeng Zhang, Siwei Ma","doi":"10.1109/ICME.2018.8486495","DOIUrl":"https://doi.org/10.1109/ICME.2018.8486495","url":null,"abstract":"Lossy image compression usually introduces undesired compression artifacts, such as blocking, ringing and blurry effects, especially in low bit rate coding scenarios. Although many algorithms have been proposed to reduce these compression artifacts, most of them are based on an image local smoothness prior, which usually leads to over-smoothing around areas with distinct structures, e.g., edges and textures. In this paper, we propose a novel framework to enhance the perceptual quality of decoded images by well preserving the edge structures and predicting visually pleasing textures. Firstly, we propose an edge-preserving generative adversarial network (EP-GAN) to achieve edge restoration and texture generation simultaneously. Then, we elaborately design an edge fidelity regularization term to guide our network, which jointly utilizes the signal fidelity, feature fidelity and adversarial constraint to reconstruct high quality decoded images. Experimental results demonstrate that the proposed EP-GAN is able to efficiently enhance decoded images at low bit rates and reconstruct more perceptually pleasing images with abundant textures and sharp edges.","PeriodicalId":426613,"journal":{"name":"2018 IEEE International Conference on Multimedia and Expo (ICME)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123665193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}