2017 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

A joint deep-network-based image restoration algorithm for multi-degradations
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019361
Xu Sun, Xiaoguang Li, L. Zhuo, K. Lam, Jiafeng Li
{"title":"A joint deep-network-based image restoration algorithm for multi-degradations","authors":"Xu Sun, Xiaoguang Li, L. Zhuo, K. Lam, Jiafeng Li","doi":"10.1109/ICME.2017.8019361","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019361","url":null,"abstract":"In the procedures of image acquisition, compression, and transmission, captured images usually suffer from various degradations, such as low-resolution and compression distortion. Although there have been a lot of research done on image restoration, they usually aim to deal with a single degraded factor, ignoring the correlation of different degradations. To establish a restoration framework for multiple degradations, a joint deep-network-based image restoration algorithm is proposed in this paper. The proposed convolutional neural network is composed of two stages. Firstly, a de-blocking subnet is constructed, using two cascaded neural network. Then, super-resolution is carried out by a 20-layer very deep network with skipping links. Cascading these two stages forms a novel deep network. Experimental results on the Set5, Setl4 and BSD100 benchmarks demonstrate that the proposed method can achieve better results, in terms of both the subjective and objective performances.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129431014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
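
A minimal PyTorch sketch of the two-stage cascade described in the abstract above: a small de-blocking subnet followed by a 20-layer VDSR-style super-resolution network with a global skip connection. The channel widths, the two-layer de-blocking depth, and the bicubic pre-upsampling are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class DeblockSubnet(nn.Module):
    """Small CNN that removes compression (blocking) artifacts."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class VDSRSubnet(nn.Module):
    """20-layer very deep CNN that predicts the high-frequency residual."""
    def __init__(self, depth=20, channels=64):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)   # global skip connection

class JointRestorationNet(nn.Module):
    """Cascade: de-block the compressed low-resolution input, then super-resolve it."""
    def __init__(self, scale=2):
        super().__init__()
        self.deblock = DeblockSubnet()
        self.upsample = nn.Upsample(scale_factor=scale, mode='bicubic', align_corners=False)
        self.sr = VDSRSubnet()

    def forward(self, x):
        x = self.deblock(x)
        x = self.upsample(x)      # VDSR-style nets operate on a pre-upsampled image
        return self.sr(x)

y = JointRestorationNet()(torch.randn(1, 1, 32, 32))
print(y.shape)   # torch.Size([1, 1, 64, 64])
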
Network-assisted strategy for DASH over CCN
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019482
Rihab Jmal, G. Simon, L. Chaari
{"title":"Network-assisted strategy for dash over CCN","authors":"Rihab Jmal, G. Simon, L. Chaari","doi":"10.1109/ICME.2017.8019482","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019482","url":null,"abstract":"MPEG Dynamic Adaptive Streaming over HTTP (DASH) has become the most used technology of video delivery nowadays. Considering the video segment more important than its location, new internet architecture such as Content Centric Network (CCN) is proposed to enhance DASH streaming. This architecture with its in-network caching salient feature improves Quality of Experience (QoE) from consumer side. It reduces delays and increases throughput by providing the requested video segment from a near point to the end user. However, there are oscillations issues induced by caching with DASH. In this paper, we propose a new Network-Assisted Strategy (NAS) based-on traffic shaping and request prediction with the aim of improving DASH flows investigating new internet architecture CCN.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129898117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
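
The NAS algorithm itself is not detailed in the abstract; the toy Python sketch below only illustrates the two ingredients it combines, request prediction and traffic shaping. The exponentially weighted moving average predictor and the token-bucket shaper are assumptions made for illustration, not the paper's method.

class RequestPredictor:
    """Predict the next requested bitrate as an EWMA of recent requests."""
    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.estimate = None

    def update(self, bitrate_kbps):
        if self.estimate is None:
            self.estimate = float(bitrate_kbps)
        else:
            self.estimate = self.alpha * bitrate_kbps + (1 - self.alpha) * self.estimate
        return self.estimate

class TokenBucketShaper:
    """Shape cached-segment delivery so the client sees a stable rate."""
    def __init__(self, rate_kbps, burst_kbits):
        self.rate = rate_kbps          # tokens (kbits) added per second
        self.capacity = burst_kbits
        self.tokens = burst_kbits

    def tick(self, dt_s):
        self.tokens = min(self.capacity, self.tokens + self.rate * dt_s)

    def send(self, size_kbits):
        # Deliver the segment only if enough tokens have accumulated.
        if size_kbits <= self.tokens:
            self.tokens -= size_kbits
            return True
        return False

predictor = RequestPredictor()
for requested in [800, 1200, 1200, 2400]:        # observed segment requests (kbps)
    target = predictor.update(requested)
shaper = TokenBucketShaper(rate_kbps=target, burst_kbits=2 * target)
print(round(target), shaper.send(1500))          # 1750 True
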
HEVC-EPIC: Edge-preserving interpolation of coded HEVC motion with applications to framerate upsampling
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019515
Dominic Rüfenacht, D. Taubman
{"title":"HEVC-EPIC: Edge-preserving interpolation of coded HEVC motion with applications to framerate upsampling","authors":"Dominic Rüfenacht, D. Taubman","doi":"10.1109/ICME.2017.8019515","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019515","url":null,"abstract":"We propose a method to obtain a high quality motion field from decoded HEVC motion. We use the block motion vectors to establish a sparse set of correspondences, and then employ an affine, edge-preserving interpolation of correspondences (EPIC) to obtain a dense optical flow. Experimental results on a variety of sequences coded at a range of QP values show that the proposed HEVC-EPIC is over five times as fast as the original EPIC flow, which uses a sophisticated correspondence estimator, while only slightly decreasing the flow accuracy. The proposed work opens the door to leveraging HEVC motion into video enhancement and analysis methods. To provide some evidence of what can be achieved, we show that when used as input to a framerate upsampling scheme, the average Y-PSNR of the interpolated frames obtained using HEVC-EPIC motion is slightly lower (0.2dB) than when original EPIC flow is used, with hardly any visible differences.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128703151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
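
A NumPy sketch of the overall idea: the decoded block motion vectors serve as a sparse set of correspondences that are densified into a per-pixel flow field. Plain inverse-distance weighting of the k nearest seeds stands in here for the affine, edge-preserving interpolation (EPIC) used in the paper, so this is only a simplified stand-in.

import numpy as np

def densify_flow(block_centers, block_mvs, height, width, k=4):
    """block_centers: (N, 2) array of (y, x); block_mvs: (N, 2) array of (dy, dx)."""
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)
    # Distances from every pixel to every block centre.
    d = np.linalg.norm(pixels[:, None, :] - block_centers[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]                  # k nearest seeds per pixel
    w = 1.0 / (np.take_along_axis(d, nearest, axis=1) + 1e-6)
    w /= w.sum(axis=1, keepdims=True)
    flow = (w[:, :, None] * block_mvs[nearest]).sum(axis=1)  # weighted average of seed MVs
    return flow.reshape(height, width, 2)

# Four 8x8 blocks of a 16x16 frame, each carrying one decoded motion vector.
centers = np.array([[4, 4], [4, 12], [12, 4], [12, 12]], dtype=np.float64)
mvs = np.array([[0.0, 1.0], [0.0, 2.0], [1.0, 0.0], [2.0, 0.0]])
dense = densify_flow(centers, mvs, 16, 16)
print(dense.shape)   # (16, 16, 2)
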
Fine-grained image recognition via weakly supervised click data guided bilinear CNN model
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019407
Guangjian Zheng, Min Tan, Jun Yu, Qing Wu, Jianping Fan
{"title":"Fine-grained image recognition via weakly supervised click data guided bilinear CNN model","authors":"Guangjian Zheng, Min Tan, Jun Yu, Qing Wu, Jianping Fan","doi":"10.1109/ICME.2017.8019407","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019407","url":null,"abstract":"Bilinear convolutional neural networks (BCNN) model, the state-of-the-art in fine-grained image recognition, fails in distinguishing the categories with subtle visual differences. We design a novel BCNN model guided by user click data (C-BCNN) to improve the performance via capturing both the visual and semantical content in images. Specially, to deal with the heavy noise in large-scale click data, we propose a weakly supervised learning approach to learn the C-BCNN, namely W-C-BCNN. It can automatically weight the training images based on their reliability. Extensive experiments are conducted on the public Clickture-Dog dataset. It shows that: (1) integrating CNN with click feature largely improves the performance; (2) both the click data and visual consistency can help to model image reliability. Moreover, the method can be easily customized to medical image recognition. Our model performs much better than conventional BCNN models on both the Clickture-Dog and medical image dataset.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129570529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
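
A PyTorch sketch of the two pieces the abstract combines: bilinear pooling of two convolutional feature maps (the B-CNN part) and a per-image reliability weight applied to the classification loss (the weakly supervised part). The backbone is omitted, and the feature sizes and the way reliability weights would be derived from click data are assumptions for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

def bilinear_pool(fa, fb):
    """fa, fb: (B, C, H, W) feature maps -> (B, C*C) bilinear descriptor."""
    B, C, H, W = fa.shape
    x = torch.bmm(fa.reshape(B, C, H * W), fb.reshape(B, C, H * W).transpose(1, 2)) / (H * W)
    x = x.reshape(B, C * C)
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-10)   # signed square root
    return F.normalize(x, dim=1)                            # l2 normalization

def weighted_ce(logits, labels, reliability):
    """Cross-entropy in which unreliable (noisy-click) images are down-weighted."""
    per_sample = F.cross_entropy(logits, labels, reduction='none')
    return (reliability * per_sample).mean()

# Toy usage: random tensors stand in for the two CNN feature streams.
fa = torch.randn(4, 32, 7, 7)
fb = torch.randn(4, 32, 7, 7)
desc = bilinear_pool(fa, fb)                                 # (4, 1024)
classifier = nn.Linear(desc.shape[1], 10)
reliability = torch.tensor([1.0, 0.2, 0.8, 1.0])             # hypothetical per-image weights
loss = weighted_ce(classifier(desc), torch.randint(0, 10, (4,)), reliability)
print(desc.shape, float(loss))
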
Reduced reference stereoscopic image quality assessment based on entropy of classified primitives
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019337
Zhaolin Wan, Feng Qi, Yutao Liu, Debin Zhao
{"title":"Reduced reference stereoscopic image quality assessment based on entropy of classified primitives","authors":"Zhaolin Wan, Feng Qi, Yutao Liu, Debin Zhao","doi":"10.1109/ICME.2017.8019337","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019337","url":null,"abstract":"Stereoscopic vision is a complex system which receives and integrates perceptual information from both monocular and binocular cues. In this paper, a novel reduced-reference stereoscopic image quality assessment scheme is proposed, based on the visual perceptual information measured by entropy of classified primitives (EoCP) and mutual information of classified primitives (MIoCP), named as DCprimary, sketch and texture primitives respectively, which is in accordance with the hierarchical progressive process of human visual perception. Specifically, EoCP of each-view image are calculated as monocular cue, and MIoCP between two-view images is derived as binocular cue. The Maximum (MAX) mechanism is applied to determine the perceptual information. The perceptual information differences between the original and distorted images are used to predict the stereoscopic image quality by support vector regression (SVR). Experimental results on LIVE phase II asymmetric database validate the proposed metric achieves significantly higher consistency with subjective ratings and outperforms state-of-the-art stereoscopic image quality assessment methods.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130599236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
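
A NumPy/scikit-learn sketch of the feature pipeline outlined in the abstract: entropy of already-classified primitive maps for each view as the monocular cue, mutual information between the two views as the binocular cue, a MAX pooling of the monocular cues, and SVR for quality regression. The primitive classification step is not reproduced (the integer label maps below are placeholders), and interpreting the MAX mechanism as a max over the two views' entropies is an assumption.

import numpy as np
from sklearn.svm import SVR

def entropy(labels, n_classes):
    p = np.bincount(labels.ravel(), minlength=n_classes) / labels.size
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(left, right, n_classes):
    joint = np.zeros((n_classes, n_classes))
    for a, b in zip(left.ravel(), right.ravel()):
        joint[a, b] += 1
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))

def stereo_feature(left_labels, right_labels, n_classes=3):
    eocp = max(entropy(left_labels, n_classes), entropy(right_labels, n_classes))
    miocp = mutual_information(left_labels, right_labels, n_classes)
    return [eocp, miocp]

rng = np.random.default_rng(0)
X = [stereo_feature(rng.integers(0, 3, (32, 32)), rng.integers(0, 3, (32, 32)))
     for _ in range(20)]
y = rng.uniform(1, 5, 20)                 # placeholder subjective scores
model = SVR(kernel='rbf').fit(X, y)
print(model.predict(X[:2]))
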
Time-ordered spatial-temporal interest points for human action classification
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019477
Mengyuan Liu, Chen Chen, Hong Liu
{"title":"Time-ordered spatial-temporal interest points for human action classification","authors":"Mengyuan Liu, Chen Chen, Hong Liu","doi":"10.1109/ICME.2017.8019477","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019477","url":null,"abstract":"Human action classification, which is vital for content-based video retrieval and human-machine interaction, finds problem in distinguishing similar actions. Previous works typically detect spatial-temporal interest points (STIPs) from action sequences and then adopt bag-of-visual words (BoVW) model to describe actions as numerical statistics of STIPs. Despite the robustness of BoVW, this model ignores the spatial-temporal layout of STIPs, leading to misclassification among different types of actions with similar numerical statistics of STIPs. Motivated by this, a time-ordered feature is designed to describe the temporal distribution of STIPs, which contains complementary structural information to traditional BoVW model. Moreover, a temporal refinement method is used to eliminate intra-variations among time-ordered features caused by performers' habits. Then a time-ordered BoVW model is built to represent actions, which encodes both numerical statistics and temporal distribution of STIPs. Extensive experiments on three challenging datasets, i.e., KTH, Rochster and UT-Interaction, validate the effectiveness of our method in distinguishing similar actions.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131959069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
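
A NumPy sketch of a time-ordered bag-of-visual-words representation: STIP descriptors are quantized against a codebook, and separate BoVW histograms are built for consecutive temporal segments so that the temporal layout of the interest points is preserved. The number of segments and the codebook size are illustrative assumptions, and the temporal refinement step is omitted.

import numpy as np

def time_ordered_bovw(descriptors, timestamps, codebook, n_segments=4):
    """descriptors: (N, D) STIP descriptors; timestamps: (N,) frame indices."""
    # Quantize every descriptor to its nearest codeword.
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = np.argmin(dists, axis=1)
    # Split the sequence into equal-length temporal segments.
    edges = np.linspace(timestamps.min(), timestamps.max() + 1e-9, n_segments + 1)
    hists = []
    for s in range(n_segments):
        in_seg = (timestamps >= edges[s]) & (timestamps < edges[s + 1])
        h = np.bincount(words[in_seg], minlength=len(codebook)).astype(float)
        if h.sum() > 0:
            h /= h.sum()
        hists.append(h)
    return np.concatenate(hists)   # ordered histograms encode the temporal layout

rng = np.random.default_rng(0)
codebook = rng.normal(size=(50, 64))     # 50 visual words
desc = rng.normal(size=(200, 64))        # 200 detected STIPs
t = rng.integers(0, 90, 200)             # their frame indices
feature = time_ordered_bovw(desc, t, codebook)
print(feature.shape)                     # (200,) = 4 segments x 50 words
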
Online low-rank similarity function learning with adaptive relative margin for cross-modal retrieval
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019528
Yiling Wu, Shuhui Wang, W. Zhang, Qingming Huang
{"title":"Online low-rank similarity function learning with adaptive relative margin for cross-modal retrieval","authors":"Yiling Wu, Shuhui Wang, W. Zhang, Qingming Huang","doi":"10.1109/ICME.2017.8019528","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019528","url":null,"abstract":"This paper presents a Cross-Modal Online Low-Rank Similarity function learning method (CMOLRS) for cross-modal retrieval, which learns a low-rank bilinear similarity measure on data from different modalities. CMOLRS models the cross-modal relations by relative similarities on a set of training data triplets and formulates the relative relations as convex hinge loss functions. By adapting the margin of hinge loss using information from feature space and label space for each triplet, CMOLRS effectively captures the multi-level semantic correlation among cross-modal data. The similarity function is learned by online learning in the manifold of low-rank matrices, thus good scalability is gained when processing large scale datasets. Extensive experiments are conducted on three public datasets. Comparisons with the state-of-the-art methods show the effectiveness and efficiency of our approach.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131981903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
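
A NumPy sketch of online bilinear similarity learning from cross-modal triplets: s(x, y) = x^T W y, a hinge loss with a per-triplet margin, a stochastic gradient step, and a rank-r truncation of W. The SVD truncation is a simple stand-in for the low-rank manifold retraction used in the paper, and the margins here are simply supplied per triplet rather than adapted from feature- and label-space information.

import numpy as np

def low_rank_project(W, rank):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def online_update(W, x, y_pos, y_neg, margin, lr, rank):
    """One pass over a triplet (x from modality A, y_pos / y_neg from modality B)."""
    loss = margin - x @ W @ y_pos + x @ W @ y_neg
    if loss > 0:                                    # hinge is active: take a step
        W = W + lr * np.outer(x, y_pos - y_neg)
        W = low_rank_project(W, rank)
    return W

rng = np.random.default_rng(0)
d_img, d_txt, rank = 20, 15, 5
W = np.zeros((d_img, d_txt))
for _ in range(200):
    x = rng.normal(size=d_img)
    y_pos = rng.normal(size=d_txt) + 0.5 * x[:d_txt]   # loosely related positive pair
    y_neg = rng.normal(size=d_txt)
    W = online_update(W, x, y_pos, y_neg, margin=1.0, lr=0.01, rank=rank)
print(np.linalg.matrix_rank(W) <= rank)   # True
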
Adaptive attention fusion network for visual question answering
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019540
Geonmo Gu, S. T. Kim, Yong Man Ro
{"title":"Adaptive attention fusion network for visual question answering","authors":"Geonmo Gu, S. T. Kim, Yong Man Ro","doi":"10.1109/ICME.2017.8019540","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019540","url":null,"abstract":"Automatic understanding of the content of a reference image and natural language questions is needed in Visual Question Answering (VQA). Generating a visual attention map that focuses on the regions related to the context of the question can improve performance of VQA. In this paper, we propose adaptive attention-based VQA network. The proposed method utilizes the complementary information from the attention maps depending on three levels of word embedding (word level, phrase level, and question level embedding), and adaptively fuses the information to represent the image-question pair appropriately. Comparative experiments have been conducted on the public COCO-QA database to validate the proposed method. Experimental results have shown that the proposed method outperforms previous methods in terms of accuracy.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132147892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
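
A PyTorch sketch of the core fusion idea: one attention map over image regions is computed for each question-embedding level (word, phrase, question), and the three attended image features are combined with adaptively predicted fusion weights. The dimensions and the softmax-based gating are illustrative assumptions rather than the paper's exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveAttentionFusion(nn.Module):
    def __init__(self, img_dim=512, q_dim=512, n_levels=3):
        super().__init__()
        self.att = nn.ModuleList(nn.Linear(img_dim + q_dim, 1) for _ in range(n_levels))
        self.gate = nn.Linear(n_levels * q_dim, n_levels)   # adaptive fusion weights

    def forward(self, regions, q_levels):
        """regions: (B, R, img_dim); q_levels: list of n_levels tensors of shape (B, q_dim)."""
        attended = []
        for att, q in zip(self.att, q_levels):
            q_exp = q.unsqueeze(1).expand(-1, regions.size(1), -1)
            scores = att(torch.cat([regions, q_exp], dim=2)).squeeze(2)   # (B, R)
            alpha = F.softmax(scores, dim=1)                              # attention map
            attended.append(torch.bmm(alpha.unsqueeze(1), regions).squeeze(1))
        w = F.softmax(self.gate(torch.cat(q_levels, dim=1)), dim=1)       # (B, n_levels)
        fused = sum(w[:, i:i + 1] * attended[i] for i in range(len(attended)))
        return fused                                                      # (B, img_dim)

model = AdaptiveAttentionFusion()
regions = torch.randn(2, 49, 512)                     # 7x7 grid of image region features
q_levels = [torch.randn(2, 512) for _ in range(3)]    # word / phrase / question embeddings
print(model(regions, q_levels).shape)                 # torch.Size([2, 512])
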
Learning to generate video object segment proposals
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019535
Jianwu Li, Tianfei Zhou, Yao Lu
{"title":"Learning to generate video object segment proposals","authors":"Jianwu Li, Tianfei Zhou, Yao Lu","doi":"10.1109/ICME.2017.8019535","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019535","url":null,"abstract":"This paper proposes a fully automatic pipeline to generate accurate object segment proposals in realistic videos. Our approach first detects generic object proposals for all video frames and then learns to rank them using a Convolutional Neural Networks (CNN) descriptor built on appearance and motion cues. The ambiguity of the proposal set can be reduced while the quality can be retained as highly as possible Next, high-scoring proposals are greedily tracked over the entire sequence into distinct tracklets. Observing that the proposal tracklet set at this stage is noisy and redundant, we perform a tracklet selection scheme to suppress the highly overlapped tracklets, and detect occlusions based on appearance and location information. Finally, we exploit holistic appearance cues for refinement of video segment proposals to obtain pixel-accurate segmentation. Our method is evaluated on two video segmentation datasets i.e. SegTrack v1 and FBMS-59 and achieves competitive results in comparison with other state-of-the-art methods.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127901977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
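
A plain-Python sketch of the greedy tracking step: per-frame proposals (reduced here to boxes with scores) are linked frame-to-frame into tracklets by assigning each tracklet the unassigned proposal with the highest overlap in the next frame. The IoU criterion and threshold are assumptions; proposal ranking, occlusion handling, and pixel-accurate refinement are omitted.

def iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def greedy_tracklets(frames, iou_thresh=0.3):
    """frames: list over time of lists of (box, score) proposals."""
    tracklets = [[p] for p in frames[0]]
    for props in frames[1:]:
        unused = list(range(len(props)))
        for track in tracklets:
            last_box = track[-1][0]
            best, best_iou = None, iou_thresh
            for j in unused:
                o = iou(last_box, props[j][0])
                if o > best_iou:
                    best, best_iou = j, o
            if best is not None:
                track.append(props[best])
                unused.remove(best)
        tracklets.extend([props[j]] for j in unused)   # unmatched proposals start new tracklets
    return tracklets

frames = [
    [((10, 10, 50, 50), 0.9)],
    [((12, 11, 52, 49), 0.8), ((100, 100, 140, 140), 0.7)],
    [((14, 12, 54, 50), 0.85), ((101, 99, 141, 141), 0.6)],
]
tracks = greedy_tracklets(frames)
print(len(tracks), [len(t) for t in tracks])   # 2 [3, 2]
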
Non-negative dictionary learning with pairwise partial similarity constraint
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2017-07-01 DOI: 10.1109/ICME.2017.8019392
Xu Zhou, Pak Lun Kevin Ding, Baoxin Li
{"title":"Non-negative dictionary learning with pairwise partial similarity constraint","authors":"Xu Zhou, Pak Lun Kevin Ding, Baoxin Li","doi":"10.1109/ICME.2017.8019392","DOIUrl":"https://doi.org/10.1109/ICME.2017.8019392","url":null,"abstract":"Discriminative dictionary learning has been widely used in many applications such as face retrieval / recognition and image classification, where the labels of the training data are utilized to improve the discriminative power of the learned dictionary. This paper deals with a new problem of learning a dictionary for associating pairs of images in applications such as face image retrieval. Compared with a typical supervised learning task, in this case the labeling information is very limited (e.g. only some training pairs are known to be associated). Further, associated pairs may be considered similar only after excluding certain regions (e.g. sunglasses in a face image). We formulate a dictionary learning problem under these considerations and design an algorithm to solve the problem. We also provide a proof for the convergence of the algorithm. Experiments and results suggest that the proposed method is advantageous over common baselines.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123174964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
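
A NumPy sketch of the non-negative dictionary learning base problem, X ≈ DA with D, A ≥ 0, solved with classic multiplicative (NMF-style) updates. The pairwise partial-similarity constraint that is the paper's contribution is not reproduced here; this only illustrates the non-negative factorization it builds on.

import numpy as np

def nn_dictionary_learning(X, n_atoms, n_iters=200, eps=1e-9):
    """X: (d, n) non-negative data; returns dictionary D (d, k) and codes A (k, n)."""
    rng = np.random.default_rng(0)
    d, n = X.shape
    D = rng.random((d, n_atoms))
    A = rng.random((n_atoms, n))
    for _ in range(n_iters):
        A *= (D.T @ X) / (D.T @ D @ A + eps)     # multiplicative update for the codes
        D *= (X @ A.T) / (D @ A @ A.T + eps)     # multiplicative update for the atoms
        norms = np.linalg.norm(D, axis=0, keepdims=True) + eps
        D /= norms                               # keep atoms unit-norm...
        A *= norms.T                             # ...and rescale codes so D @ A is unchanged
    return D, A

rng = np.random.default_rng(1)
X = rng.random((64, 100))                        # 100 non-negative feature vectors
D, A = nn_dictionary_learning(X, n_atoms=16)
print(D.shape, A.shape, np.linalg.norm(X - D @ A) / np.linalg.norm(X))
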