2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP): latest papers, page 5

Video temporal super-resolution using nonlocal registration and self-similarity

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813400

Matteo Maggioni, P. Dragotti

Abstract: In this paper we present a novel temporal super-resolution method for increasing the frame rate of single videos. The proposed algorithm is based on motion-compensated 3-D patches, i.e., sequences of 2-D blocks following a given motion trajectory. The trajectories are computed through a coarse-to-fine motion estimation strategy embedding a regularized block-wise distance metric that takes into account the coherence of neighbouring motion vectors. Our algorithm comprises two stages. In the first stage, a nonlocal search procedure finds a set of 3-D patches (targets) similar to a given patch (reference); all targets are then registered at sub-pixel precision with respect to the reference in an upsampled 3-D FFT domain; and finally all registered patches are aggregated at their appropriate locations in the high-resolution video. The second stage further improves the estimation quality by correcting each 3-D patch of the video obtained from the first stage with a linear operator learned from the self-similarity of patches at a lower temporal scale. Our experimental evaluation on color videos shows that the proposed approach achieves high-quality super-resolution results from both an objective and a subjective point of view.
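The nonlocal search step of the first stage can be illustrated with a small sketch. It is a simplification under stated assumptions, not the authors' implementation: it works on 2-D blocks of a single frame rather than motion-compensated 3-D patches, it uses a plain sum of squared differences instead of the regularized block-wise metric mentioned in the abstract, and the names `extract_patch`, `ssd` and `nonlocal_search` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]


def extract_patch(frame: Frame, top: int, left: int, size: int) -> List[float]:
    """Return the size x size block whose top-left corner is (top, left), flattened."""
    return [frame[top + r][left + c] for r in range(size) for c in range(size)]


def ssd(a: List[float], b: List[float]) -> float:
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nonlocal_search(
    frame: Frame, ref_pos: Tuple[int, int], size: int, k: int
) -> List[Tuple[float, Tuple[int, int]]]:
    """Return the k (distance, position) pairs closest to the reference patch.

    Assumption: exhaustive search over the whole frame, excluding the
    reference position itself.
    """
    ref = extract_patch(frame, ref_pos[0], ref_pos[1], size)
    height, width = len(frame), len(frame[0])
    candidates = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            if (top, left) == ref_pos:
                continue  # skip the reference patch
            dist = ssd(ref, extract_patch(frame, top, left, size))
            candidates.append((dist, (top, left)))
    candidates.sort(key=lambda item: item[0])
    return candidates[:k]


# Toy frame in which the block at (0, 0) reappears at (0, 4).
frame = [
    [1, 2, 0, 0, 1, 2],
    [3, 4, 0, 0, 3, 4],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0, 0],
]
targets = nonlocal_search(frame, ref_pos=(0, 0), size=2, k=1)
```

In a fuller pipeline, the returned targets would then be registered against the reference and aggregated, as the abstract describes; those steps are not sketched here.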

Citations: 0

Affective states classification using EEG and semi-supervised deep learning approaches

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813351

Haiyan Xu, K. Plataniotis

Abstract: Affective states of a user provide important information for many applications, such as personalized information (e.g., multimedia content) retrieval/delivery or intelligent human-computer interface design. In recent years, physiological signals, electroencephalogram (EEG) in particular, have been shown to be very effective in estimating a user's affective states during social interaction or under video or audio stimuli. However, due to the large number of parameters associated with the neural expression of emotion, there are still many unknowns regarding the specific spatial and spectral correlation between the EEG signal and the expression of affective states. To investigate this correlation, two types of semi-supervised deep learning approaches, stacked denoising autoencoders (SDAE) and deep belief networks (DBN), were applied as application-specific feature extractors for the affective states classification problem using EEG signals. To evaluate the efficacy of the proposed semi-supervised approaches, a subject-specific affective states classification experiment was carried out on the DEAP database to classify 2-dimensional affect states. The DBN-based model achieved averaged F1 scores of 86.67%, 86.60% and 86.69% for arousal, valence and liking states classification respectively, a significant improvement over the state-of-the-art classification performance. By examining the weight vectors at each layer, we were also able to gain insights into the spatial or spectral locations of the most discriminating features. Another main advantage of applying the semi-supervised learning methods is that only a small fraction of labeled data, e.g., 1/6 of the training samples, was used in this study.
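For readers unfamiliar with the reported metric, the sketch below shows how an F1 score is computed for a binary label and how per-subject scores could be averaged. The abstract does not say how the averaging was done, so averaging over subjects is an assumption here, and the function names and toy labels are hypothetical.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(per_subject_scores: Sequence[float]) -> float:
    """Average F1 over subjects (assumed averaging scheme, not stated in the abstract)."""
    return sum(per_subject_scores) / len(per_subject_scores)


# Toy example: one subject, high arousal = 1, low arousal = 0.
subject_a = f1_score([1, 1, 0, 0], [1, 0, 0, 0])
subject_b = f1_score([1, 0, 1, 0], [1, 0, 1, 0])
average = mean_f1([subject_a, subject_b])
```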

Citations: 69

Detection of fake 3D video using CNN

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813368

Shuvendu Rana, S. Gaj, A. Sur, P. Bora

Citations: 7

Optimised selection of structure of pictures for video coding

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813359

Vigneswaran Poobalasingam, E. Izquierdo, Saverio G. Blasi, M. Mrak

Citations: 1

An embedded 3D geometry score for mobile 3D visual search

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813366

Hanwei Wu, Haopeng Li, M. Flierl

Citations: 0

Two-layer large-scale cover song identification system based on music structure segmentation

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813372

Kang Cai, Deshun Yang, Xiaoou Chen

Citations: 5

Layer-based temporal dependent rate-distortion optimization in Random-Access hierarchical video coding

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813408

Yanbo Gao, Ce Zhu, Shuai Li, Tianwu Yang

Abstract: Rate-distortion optimization (RDO) plays an important part in improving the coding efficiency of High Efficiency Video Coding (HEVC), especially for the hierarchical coding structure defined in the Random-Access (RA) configuration, denoted Random-Access Hierarchical Video Coding (RA-HVC), where different frames are assigned to different temporal layers and coded with different coding parameters. Due to inter-frame prediction, the coding result of one unit may affect the coding performance of the following temporally related units. Therefore, the temporal dependency among units needs to be considered in the coding process. However, the RDO process in the current video codec is performed without considering the varying temporal dependency, thus compromising the rate-distortion performance significantly. To address this problem, a layer-based temporal dependent RDO method is proposed in this paper, where the temporal dependency among different frames in the same or different layers is examined. By reformulating the temporal dependent RDO for RA-HVC, we show that it can be implemented by simply refining the Lagrange multiplier. Experimental results show that the proposed method achieves, on average, about 1.4% BD-rate savings with a negligible increase in encoding time for the random-access configuration.
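The idea of refining the Lagrange multiplier can be made concrete with a toy mode decision. Standard RDO picks the candidate that minimizes J = D + λR. The sketch assumes, purely for illustration, that the refinement takes the form of a per-layer scaling factor on λ; the abstract does not give the actual formula, and `layer_scale`, the candidate modes and their numbers are all hypothetical.

```python
from typing import Dict, List


def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate


def best_mode(candidates: List[Dict], lam: float, layer_scale: float = 1.0) -> Dict:
    """Pick the candidate with the lowest cost under a refined multiplier.

    Assumption: the refinement is modelled as lambda' = layer_scale * lambda.
    """
    refined_lam = layer_scale * lam
    return min(
        candidates,
        key=lambda m: rd_cost(m["distortion"], m["rate"], refined_lam),
    )


# Two hypothetical coding modes for one block.
modes = [
    {"name": "A", "distortion": 100.0, "rate": 10.0},  # cheap but less accurate
    {"name": "B", "distortion": 60.0, "rate": 30.0},   # accurate but costly
]
unscaled_choice = best_mode(modes, lam=1.0)                 # costs: A=110, B=90
scaled_choice = best_mode(modes, lam=1.0, layer_scale=3.0)  # costs: A=130, B=150
```

Changing only the multiplier flips the decision between the two modes, which is why adjusting λ per layer can steer bits toward the frames that other frames depend on without altering the rest of the encoder.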

Citations: 7

Reliability-based mesh-to-grid image reconstruction

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813344

Ján Koloda, Jürgen Seiler, André Kaup

Citations: 2

Assessment of sound source localization of an intra-aural audio wearable device for audio augmented reality applications

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813370

Narimene Lezzoum, J. Voix

Abstract: This paper presents a study on the effect of an intra-aural Audio Wearable Device (AWD) on sound source localization. The AWD used in this study is equipped with a miniature outer-ear microphone, a miniature digital signal processor and a miniature internal loudspeaker in addition to other electronics. It is intended for audio augmented reality applications, for example to play 3D audio sounds and stimuli while keeping the wearer protected from loud or unwanted ambient noise. The AWD is evaluated in terms of ambient source localization using three localization cues, computed from signals played from different positions in the horizontal and sagittal planes and recorded in the ear canals of an artificial head with and without a pair of AWDs. The localization cues are the inter-aural time difference, the inter-aural level difference, and the head-related transfer functions. Results showed that the AWD barely affects the localization of low-frequency sounds, with a localization error of around 2°, and only affects the localization of higher-frequency sounds depending on their position and frequency range.
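Two of the three cues named in the abstract are simple to compute from a pair of ear-canal recordings. The sketch below estimates the inter-aural time difference as the lag that maximizes the cross-correlation, and the inter-aural level difference as an RMS ratio in decibels. These are common textbook estimators chosen for illustration; the abstract does not state which estimators the authors used, head-related transfer functions are not covered, and the function names and signals are hypothetical.

```python
import math
from typing import Sequence


def ild_db(left: Sequence[float], right: Sequence[float]) -> float:
    """Inter-aural level difference: 20 * log10(RMS_left / RMS_right), in dB."""
    rms_left = math.sqrt(sum(x * x for x in left) / len(left))
    rms_right = math.sqrt(sum(x * x for x in right) / len(right))
    return 20.0 * math.log10(rms_left / rms_right)


def itd_seconds(
    left: Sequence[float], right: Sequence[float], sample_rate: float, max_lag: int
) -> float:
    """Inter-aural time difference from the peak of the cross-correlation.

    Convention (an assumption): a positive value means the right-ear signal
    is delayed relative to the left-ear signal.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            left[n] * right[n + lag]
            for n in range(len(left))
            if 0 <= n + lag < len(right)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Toy impulse pair: the right ear hears the click 2 samples later, at half amplitude.
left_ear = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right_ear = [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
itd = itd_seconds(left_ear, right_ear, sample_rate=1000.0, max_lag=4)
ild = ild_db(left_ear, right_ear)
```

Comparing such cues between recordings made with and without the device is one way to quantify how much the device alters what the listener receives.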

Citations: 0

Optimizing tone mapping operators for keypoint detection under illumination changes

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813340

A. Rana, G. Valenzise, F. Dufaux

Abstract: Tone mapping operators (TMOs) have recently raised interest for their capability to handle illumination changes. However, these TMOs are optimized with respect to perception rather than image analysis tasks like key point detection. Moreover, no work has been done to analyze the factors affecting the optimization of TMOs for such tasks. In this paper, we investigate the influence of two factors, the Correlation Coefficient (CC) and the Repeatability Rate (RR) of the tone mapped images, on the optimization of classical Retinex-based models to enhance key point detection under illumination changes. CC-based optimized models aim at increasing the similarity of the tone mapped images. Conversely, RR-based optimized models quantify the optimal detection performance gains. By considering two simple Retinex-based models, i.e., Gaussian and bilateral filtering, we show that CC-based optimized models, which estimate the illumination as precisely as possible, do not necessarily lead to optimal key point detection performance. We conclude that, instead, other criteria specific to RR-based optimized models should be taken into account. Moreover, large gains in performance with respect to existing popular TMOs motivate further research towards optimal tone mapping techniques for computer vision applications.
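The two criteria compared in the abstract can be sketched as follows. The correlation coefficient is the usual Pearson correlation between the pixel values of two tone mapped images. The repeatability rate is shown in a simplified form, as the fraction of reference keypoints that have a detection within `eps` pixels in the second image. The exact definitions used in the paper are not given in the abstract, so both functions and the tolerance value are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def correlation_coefficient(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two flattened images of equal size."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant image has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def repeatability_rate(
    kp_ref: Sequence[Point], kp_test: Sequence[Point], eps: float = 2.0
) -> float:
    """Fraction of reference keypoints re-detected within eps pixels (simplified)."""
    if not kp_ref:
        return 0.0
    repeated = sum(
        1
        for (x, y) in kp_ref
        if any(math.hypot(x - u, y - v) <= eps for (u, v) in kp_test)
    )
    return repeated / len(kp_ref)


# Toy data: the second image is a scaled copy, and two of four keypoints reappear.
cc = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
rr = repeatability_rate(
    [(0, 0), (10, 10), (20, 20), (30, 30)],
    [(1, 0), (10, 11), (50, 50)],
)
```

The toy data shows how the two measures can disagree: the images are perfectly correlated while only half of the keypoints repeat, which mirrors the abstract's point that optimizing for CC does not guarantee good detection.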

Citations: 11