2017 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

QoE enhancement through cost-effective adaptation decision process for multiple-server streaming over HTTP
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019378
Joachim Bruneau-Queyreix, Mathias Lacaud, D. Négru, J. M. Batalla, E. Borcoci
Abstract: Single-source HTTP Adaptive Streaming (HAS) protocols, such as MPEG-DASH, have become the de-facto solutions to deliver video over the Internet. By avoiding buffer stalling events, which are mainly caused by a lack of throughput at the client or server side, HAS protocols increase the end-user's Quality of Experience (QoE). We propose to extend HAS capabilities into a pragmatic DASH-compliant Multiple-Source Streaming solution (MS-Stream) that utilizes several servers simultaneously. MS-Stream offers the opportunity to obtain higher QoE by exploiting expanded bandwidth and link diversity in heterogeneous distributed streaming infrastructures, such as distributed home gateways or geographically distributed set-top boxes belonging to Over-The-Top video service providers. This paper presents a cost-effective two-phase adaptation process with dual (i.e., bitrate and number of sources) adaptation decisions made prior to each segment request, followed by in-segment download adaptation. Our approach was empirically evaluated for on-demand video streaming over the Internet. An online demonstration is also made available [1].
Citations: 11
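The abstract outlines a two-phase process: a dual (bitrate, number of sources) decision before each segment request, followed by in-segment adaptation. As a rough illustration of the pre-request phase only, here is a minimal Python sketch of one plausible decision rule; the bitrate ladder, safety margin, and buffer threshold are invented for the example and are not the paper's algorithm.

```python
# Hypothetical pre-request dual adaptation decision: pick the highest
# sustainable bitrate and the fewest servers whose aggregate estimated
# throughput covers it with a safety margin. All thresholds are assumptions.
def choose_bitrate_and_sources(server_throughputs_kbps, buffer_s,
                               bitrates_kbps=(500, 1200, 2500, 5000),
                               safety=0.8, low_buffer_s=5.0):
    servers = sorted(server_throughputs_kbps, reverse=True)
    # Be more conservative when the playback buffer is nearly empty.
    margin = safety * (0.5 if buffer_s < low_buffer_s else 1.0)
    best = (bitrates_kbps[0], 1)           # fallback: lowest bitrate, 1 source
    for bitrate in bitrates_kbps:          # ascending bitrate ladder
        aggregate, used = 0.0, 0
        for thr in servers:
            aggregate += thr
            used += 1
            if aggregate * margin >= bitrate:
                best = (bitrate, used)     # sustainable with `used` sources
                break
    return best

print(choose_bitrate_and_sources([3000, 1500, 800], buffer_s=12.0))  # (2500, 2)
```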
Efficient image sensor noise estimation via iterative re-weighted least squares
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019427
Li Dong, Jiantao Zhou, Guangtao Zhai
Abstract: Noise estimation is crucial in many image processing algorithms, such as image denoising. Conventionally, noise is assumed to be a signal-independent additive white Gaussian process. However, for the real raw data of imaging sensors, the noise is better modeled as signal-dependent. In this work, we propose an efficient image sensor noise estimation method based on iterative re-weighted least squares optimization. Specifically, the image patches are first clustered into different groups, each of which generates a data sample. To fit those observations robustly, we introduce a weighting matrix that reflects the credibility of each sample. Unfortunately, the setting of this weighting matrix in turn depends on the unknown noise parameters. We therefore develop an iterative re-weighted least squares optimization procedure in which the weighting matrix and the parameter estimates are updated alternately. Experimental results show that our method outperforms state-of-the-art works in terms of both estimation accuracy and computational efficiency.
Citations: 5
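The alternation the abstract describes (weights depend on the unknown noise parameters; parameters are re-fit under the weights) is the classic IRLS loop. Below is a minimal NumPy sketch fitting a signal-dependent model var = a*mean + b from per-cluster (mean, variance) samples; the inverse-squared-prediction weighting is a standard choice assumed here, not necessarily the paper's credibility weights.

```python
# IRLS sketch for a signal-dependent noise model var(mu) = a*mu + b,
# under the assumption of variance-function weighting w_i = 1/pred_i^2.
import numpy as np

def irls_noise_fit(means, variances, n_iter=20, eps=1e-8):
    X = np.column_stack([means, np.ones_like(means)])  # design matrix [mu, 1]
    y = variances
    w = np.ones_like(y)                                # start unweighted
    for _ in range(n_iter):
        XtW = X.T * w                                  # == X.T @ diag(w)
        theta = np.linalg.solve(XtW @ X, XtW @ y)      # weighted LS solve
        pred = X @ theta
        w = 1.0 / np.maximum(pred, eps) ** 2           # re-weight by fit
    return theta                                       # (a, b)

rng = np.random.default_rng(0)
mu = rng.uniform(0.05, 0.9, 200)
true_var = 0.01 * mu + 0.002
var = true_var * rng.chisquare(50, 200) / 50           # noisy variance samples
print(irls_noise_fit(mu, var))                         # approx. [0.01, 0.002]
```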
Deep learning for multimodal-based video interestingness prediction
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019300
Yuesong Shen, C. Demarty, Ngoc Q. K. Duong
Abstract: Predicting the interestingness of media content remains an important but challenging research subject. The difficulty comes first from the fact that, besides being a high-level semantic concept, interestingness is highly subjective, and no global definition has yet been agreed upon. This paper presents the use of up-to-date deep learning techniques for solving the task. We perform experiments with both social-driven (i.e., Flickr videos) and content-driven (i.e., videos from the MediaEval 2016 interestingness task) datasets. To account for the temporal aspect and multimodality of videos, we tested various deep neural network (DNN) architectures, including a new combination of several recurrent neural networks (RNNs) that handles several temporal samples at the same time. We then investigated different strategies for dealing with unbalanced datasets. Multimodality, as the mid-level fusion of audio and visual information, brought benefit to the task. We also established that social interestingness differs from content interestingness.
Citations: 9
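To make the "mid-level fusion plus recurrent temporal modeling" idea concrete, here is an illustrative PyTorch sketch: per-frame visual and audio features are concatenated (mid-level fusion) and pooled over time by a GRU into a single interestingness score. The feature dimensions, hidden size, and single-GRU design are assumptions; the paper combines several RNNs.

```python
# Hedged sketch of an RNN-based interestingness predictor over fused
# per-frame audio-visual features. Dimensions are illustrative only.
import torch
import torch.nn as nn

class InterestingnessRNN(nn.Module):
    def __init__(self, visual_dim=2048, audio_dim=128, hidden=256):
        super().__init__()
        # Mid-level fusion: concatenate per-frame visual and audio features.
        self.rnn = nn.GRU(visual_dim + audio_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, visual, audio):
        x = torch.cat([visual, audio], dim=-1)   # (batch, time, features)
        _, h = self.rnn(x)                       # final hidden state
        return torch.sigmoid(self.head(h[-1]))   # interestingness in [0, 1]

model = InterestingnessRNN()
score = model(torch.randn(4, 16, 2048), torch.randn(4, 16, 128))
print(score.shape)  # torch.Size([4, 1])
```

For the unbalanced-data issue the abstract mentions, one common strategy is weighting the positive class in the loss (e.g., `BCEWithLogitsLoss(pos_weight=...)` in PyTorch), though the paper's chosen strategies are not detailed in the abstract.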
Impact of video resolution changes on QoE for adaptive video streaming
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019297
Avsar Asan, W. Robitza, I. Mkwawa, Lingfen Sun, E. Ifeachor, A. Raake
Abstract: HTTP adaptive streaming (HAS) has become the de-facto standard for video streaming to ensure continuous multimedia service delivery under irregularly changing network conditions. Many studies have already investigated the detrimental impact of various playback characteristics, such as initial loading, stalling, or quality variations, on the Quality of Experience (QoE) of end users. However, dedicated studies tackling the impact of resolution adaptation are still missing. This paper presents the results of an immersive audiovisual quality assessment test comprising 84 test sequences from four different video content types, emulated with an HAS adaptation mechanism. We employed a novel approach based on the systematic creation of adaptivity conditions, which were assigned to source sequences based on their spatio-temporal characteristics. Our experiment investigates the resolution switch effect with respect to the degradations in MOS for certain adaptation patterns. We further demonstrate that the content type and resolution change patterns have a significant impact on the perception of resolution changes. These findings will help develop better QoE models and adaptation mechanisms for HAS systems in the future.
Citations: 21
Recognition and retrieval of sound events using sparse coding convolutional neural network
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019552
Chien-Yao Wang, A. Santoso, S. Mathulaprangsan, Chin-Chin Chiang, Chung-Hsien Wu, Jia-Ching Wang
Abstract: This paper proposes a novel deep convolutional neural network (CNN), called the sparse coding convolutional neural network (SC-CNN), to address the sound event recognition and retrieval task. Unlike the general CNN framework, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization procedure of the human brain, including encoding, storage, and recollection. Sound data from the RWCP sound scene dataset, with added noise from the NOISEX-92 noise dataset, are used to compare the performance of the proposed system with state-of-the-art baselines. The experimental results indicate that the proposed SC-CNN outperforms the state-of-the-art systems in sound event recognition and retrieval. In the sound event recognition task, the proposed system achieved an accuracy of 94.6%, 100%, and 100% under 0 dB, 10 dB, and clean conditions, respectively. In the retrieval task, the proposed system improves the mAP rate of the general CNN by approximately 6%.
Citations: 10
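For readers unfamiliar with the sparse-coding component, the sketch below shows the standard encoding step (ISTA, iterative shrinkage-thresholding) that expresses a feature vector as a sparse combination of dictionary atoms. How SC-CNN couples this with its convolutional layers is not specified in the abstract; this NumPy example only illustrates the generic technique.

```python
# ISTA sketch: solve min_z 0.5*||x - D z||^2 + lam*||z||_1.
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_encode(D, x, lam=0.1, n_iter=100):
    """D: (feat, atoms) dictionary with unit-norm columns; x: (feat,)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)           # gradient of the data term
        z = soft_threshold(z - grad / L, lam / L)
    return z

rng = np.random.default_rng(1)
D = rng.normal(size=(64, 256))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
x = 2.0 * D[:, 3] - 1.5 * D[:, 42]         # sparse ground truth: atoms 3, 42
z = ista_encode(D, x)
print(np.argsort(-np.abs(z))[:2])          # expected to recover [3, 42]
```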
Cross-media retrieval with semantics clustering and enhancement
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019310
Minfeng Zhan, L. Li, Qingming Huang, Yugui Liu
Abstract: Cross-media retrieval, which uses a text query to search for images and vice versa, has attracted wide attention in recent years. Most existing cross-media retrieval methods aim at finding a common subspace and maximizing the correlations between different modalities, but these approaches do not directly capture the underlying semantic information of the modalities. This paper proposes a novel cross-media retrieval method based on semantics clustering and enhancement, in which a semantic-preserving mapping is learned from the original space to the target semantic space. Meanwhile, in order to improve the demarcation of the semantic space, we enhance the semantic manifold by learning a dimension-invariant matrix. Our approach not only maximizes the correlation between different modalities, but also increases the discriminative ability among different categories. Experiments show that our approach outperforms popular methods on two real-world datasets.
Citations: 3
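As background for the "mapping into a target semantic space" idea, here is a deliberately simplified NumPy sketch: each modality's features are projected into a shared label space by ridge regression, after which text and image items can be ranked directly. The paper's clustering and manifold-enhancement steps are not reproduced; this only shows the baseline semantic-projection mechanism.

```python
# Simplified semantic-space projection for cross-media retrieval.
import numpy as np

def learn_semantic_map(X, Y, lam=1e-2):
    """X: (n, d) modality features; Y: (n, c) semantic targets
    (e.g., one-hot category vectors). Returns W of shape (d, c)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(3)
X_img, X_txt = rng.normal(size=(100, 64)), rng.normal(size=(100, 32))
Y = np.eye(10)[rng.integers(0, 10, 100)]           # shared category labels
W_img, W_txt = learn_semantic_map(X_img, Y), learn_semantic_map(X_txt, Y)

# Retrieval: embed a text query and all image candidates, rank by cosine.
q = X_txt[0] @ W_txt
emb = X_img @ W_img
sims = emb @ q / (np.linalg.norm(emb, axis=1) * np.linalg.norm(q) + 1e-12)
print(sims.argsort()[::-1][:5])                    # top-5 image indices
```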
SynCam: Capturing sub-frame synchronous media using smartphones
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019430
Ishit Mehta, P. Sakurikar, R. Shah, P. J. Narayanan
Abstract: Smartphones have become the de-facto capture devices for everyday photography. Unlike traditional digital cameras, smartphones are versatile devices with auxiliary sensors, processing power, and networking capabilities. In this work, we harness the communication capabilities of smartphones and present a synchronous, coordinated multi-camera capture system. Synchronous capture is important for many image/video fusion and 3D reconstruction applications, and the proposed system provides an inexpensive and effective means to capture multi-camera media for such applications. Our coordinated capture system is based on a wireless protocol that uses NTP-based synchronization and device-specific lag compensation. It achieves sub-frame synchronization across all participating smartphones, even of heterogeneous make and model. We propose a new method based on fiducial markers displayed on an LCD screen to temporally calibrate smartphone cameras. We demonstrate the utility and versatility of this system to enhance traditional videography and to create novel visual representations such as panoramic videos, HDR videos, multi-view 3D reconstruction, multi-flash imaging, and multi-camera social media.
Citations: 0
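The building block behind NTP-based synchronization is the standard four-timestamp exchange for estimating clock offset and round-trip delay, sketched below. The device-specific capture-lag compensation the abstract mentions would be applied on top of this estimate; that part is not shown.

```python
# Standard NTP-style clock-offset estimation from one request/response pair.
def ntp_offset_and_delay(t0, t1, t2, t3):
    """t0: client send, t1: server receive, t2: server send,
    t3: client receive (all in seconds, each in its own local clock)."""
    offset = ((t1 - t0) + (t2 - t3)) / 2.0   # estimated clock offset
    delay = (t3 - t0) - (t2 - t1)            # round-trip network delay
    return offset, delay

# Example: the client clock runs 0.250 s behind the coordinator's clock.
print(ntp_offset_and_delay(t0=10.000, t1=10.270, t2=10.271, t3=10.041))
# -> (0.25, 0.04)
```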
Video salient object detection via cross-frame cellular automata
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019389
Jingfan Guo, Tongwei Ren, Lei Huang, Xingyu Liu, Ming-Ming Cheng, Gangshan Wu
Abstract: Salient object detection aims to detect the visually attractive objects in images and videos. In this paper, we propose a novel salient object detection method for videos based on cross-frame cellular automata. Given a video, we first represent the video frames with superpixels and construct a saliency propagation network among superpixels, both within a frame and between adjacent frames, based on their appearance similarities and temporal coherency. Second, we initialize the saliency map of each frame with the fusion of two saliency maps generated independently from appearance and motion features. Finally, we utilize cellular automata updating to propagate saliency among superpixels iteratively and generate coherent saliency maps with complete objects. The experimental results show that our method outperforms the state-of-the-art methods on different types of videos.
Citations: 12
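To illustrate the cellular-automata update, here is a NumPy sketch in which each superpixel's saliency is iteratively refreshed from similarity-weighted neighbors (the neighbor set could span a frame and its adjacent frames, matching the cross-frame idea). The Gaussian similarity and the fixed blending factor are simplifying assumptions relative to the paper's impact and coherence matrices.

```python
# Cellular-automata-style saliency propagation over superpixels.
import numpy as np

def propagate_saliency(S0, features, n_iter=10, sigma=0.2, alpha=0.6):
    """S0: (n,) initial saliency; features: (n, d) superpixel descriptors
    (within one frame, or stacked across adjacent frames)."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    F = np.exp(-d2 / (2 * sigma ** 2))       # appearance-similarity matrix
    np.fill_diagonal(F, 0.0)                 # no self-influence
    F /= F.sum(axis=1, keepdims=True)        # row-normalize
    S = S0.copy()
    for _ in range(n_iter):
        # Keep a fraction of each cell's own state; absorb the rest from
        # its similarity-weighted neighbors.
        S = alpha * S + (1 - alpha) * (F @ S)
    return S

rng = np.random.default_rng(2)
feats = np.vstack([rng.normal(0, 0.1, (20, 3)), rng.normal(1, 0.1, (30, 3))])
init = np.concatenate([np.full(20, 0.9), np.full(30, 0.1)])
print(propagate_saliency(init, feats).round(2))  # saliency stays coherent per cluster
```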
Deep multimodal network for multi-label classification
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019322
T. Chen, Shangfei Wang, Shiyu Chen
Abstract: Current multimodal deep learning approaches rarely explicitly exploit the dependencies inherent in multiple labels, which are crucial for multimodal multi-label classification. In this paper, we propose a multimodal deep learning approach for multi-label classification. Specifically, we introduce deep networks for feature representation learning and construct classifiers with an objective function that is constrained by dependencies among both labels and modalities. We further propose an effective training algorithm to learn the deep networks and classifiers jointly. Thus, we explicitly leverage the relations among labels and modalities to facilitate multimodal multi-label classification. Experiments on multi-label classification and cross-modal retrieval on the Pascal VOC dataset and the LabelMe dataset demonstrate the effectiveness of the proposed approach.
Citations: 9
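To make "an objective constrained by dependencies among labels and modalities" more tangible, here is a heavily hedged PyTorch sketch: two per-modality encoders, a cross-modality agreement penalty, and a label-consistency penalty driven by a label co-occurrence matrix. The paper's exact constraint formulation is not given in the abstract, so every term here is an illustrative stand-in.

```python
# Hypothetical joint objective coupling modalities and label dependencies.
import torch
import torch.nn as nn

class MultimodalMultiLabel(nn.Module):
    def __init__(self, img_dim=512, txt_dim=300, hidden=128, n_labels=20):
        super().__init__()
        self.img_net = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_labels))
        self.txt_net = nn.Sequential(nn.Linear(txt_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_labels))

    def forward(self, img, txt):
        return self.img_net(img), self.txt_net(txt)

def joint_loss(img_logits, txt_logits, y, label_corr, beta=0.1, gamma=0.1):
    bce = nn.functional.binary_cross_entropy_with_logits
    loss = bce(img_logits, y) + bce(txt_logits, y)
    loss += beta * (img_logits - txt_logits).pow(2).mean()  # modal agreement
    # Encourage predictions consistent with label co-occurrence structure.
    p = torch.sigmoid(img_logits)
    loss += gamma * (p - p @ label_corr).pow(2).mean()
    return loss

model = MultimodalMultiLabel()
img, txt = torch.randn(8, 512), torch.randn(8, 300)
y = (torch.rand(8, 20) > 0.8).float()
corr = torch.eye(20)                  # placeholder co-occurrence matrix
print(joint_loss(*model(img, txt), y, corr).item())
```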
A deep convolutional neural network approach for complexity reduction on intra-mode HEVC
2017 IEEE International Conference on Multimedia and Expo (ICME) Pub Date: 2017-07-10 DOI: 10.1109/ICME.2017.8019316
Tianyi Li, Mai Xu, Xin Deng
Abstract: The High Efficiency Video Coding (HEVC) standard significantly reduces coding bit-rate over the preceding H.264 standard, but at the expense of extremely high encoding complexity. In fact, the coding tree unit (CTU) partition consumes a large proportion of HEVC encoding complexity, due to the brute-force search for rate-distortion optimization (RDO). Therefore, we propose in this paper a complexity reduction approach for intra-mode HEVC, which learns a deep convolutional neural network (CNN) model to predict the CTU partition instead of performing RDO. First, we establish a large-scale database with diverse patterns of CTU partition. Second, we model the partition as a three-level classification problem. Then, to solve the classification problem, we develop a deep CNN structure with various sizes of convolutional kernels and extensive trainable parameters, which can be learned from the established database. Finally, experimental results show that our approach reduces intra-mode encoding time by 62.25% and 69.06%, with a negligible Bjøntegaard delta bit-rate of 2.12% and 1.38%, over the test sequences and images respectively, superior to other state-of-the-art approaches.
Citations: 59
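The "three-level classification" framing maps naturally onto the three split decisions of a 64x64 CTU in HEVC: whether to split the CTU into 32x32 CUs, each 32x32 into 16x16, and each 16x16 into 8x8. The PyTorch sketch below shows this output structure; the convolutional kernel sizes and layer widths are assumptions, not the configuration reported in the paper.

```python
# Illustrative three-level CTU-partition classifier over a 64x64 luma block.
import torch
import torch.nn as nn

class CTUPartitionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=8, stride=8), nn.ReLU(),  # 64 -> 8
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.level1 = nn.Linear(32 * 8 * 8, 1)    # split the 64x64 CTU?
        self.level2 = nn.Linear(32 * 8 * 8, 4)    # split each 32x32 CU?
        self.level3 = nn.Linear(32 * 8 * 8, 16)   # split each 16x16 CU?

    def forward(self, luma_ctu):                  # (batch, 1, 64, 64)
        f = self.features(luma_ctu).flatten(1)
        return (torch.sigmoid(self.level1(f)),
                torch.sigmoid(self.level2(f)),
                torch.sigmoid(self.level3(f)))

net = CTUPartitionNet()
p1, p2, p3 = net(torch.randn(2, 1, 64, 64))
print(p1.shape, p2.shape, p3.shape)  # (2,1) (2,4) (2,16)
```

At encoding time, thresholding these probabilities replaces the brute-force RDO search over partition candidates, which is where the reported time savings come from.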