2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP): latest papers

Video temporal super-resolution using nonlocal registration and self-similarity
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813400
Matteo Maggioni, P. Dragotti
{"title":"Video temporal super-resolution using nonlocal registration and self-similarity","authors":"Matteo Maggioni, P. Dragotti","doi":"10.1109/MMSP.2016.7813400","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813400","url":null,"abstract":"In this paper we present a novel temporal super-resolution method for increasing the frame-rate of single videos. The proposed algorithm is based on motion-compensated 3-D patches, i.e., a sequence of 2-D blocks following a given motion trajectory. The trajectories are computed through a coarse-to-fine motion estimation strategy embedding a regularized block-wise distance metric that takes into account the coherence of neighbouring motion vectors. Our algorithm comprises two stages. In the first stage, a nonlocal search procedure is used to find a set of 3-D patches (targets) similar to a given patch (reference), subsequently all targets are registered at sub-pixel precision with respect to the reference in an upsampled 3-D FFT domain, and finally all registered patches are aggregated at their appropriate locations in the high-resolution video. The second stage is used to further improve the estimation quality by correcting each 3-D patch of the video obtained from the first stage with a linear operator learned from the self-similarity of patches at a lower temporal scale. 
Our experimental evaluation on color videos shows that the proposed approach achieves high quality super-resolution results from both an objective and subjective point of view.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115293621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Affective states classification using EEG and semi-supervised deep learning approaches
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813351
Haiyan Xu, K. Plataniotis
{"title":"Affective states classification using EEG and semi-supervised deep learning approaches","authors":"Haiyan Xu, K. Plataniotis","doi":"10.1109/MMSP.2016.7813351","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813351","url":null,"abstract":"Affective states of a user provide important information for many applications, such as personalized information (e.g., multimedia content) retrieval/delivery or intelligent human-computer interface design. In recent years, physiological signals, Electroencephalogram (EEG) in particular, have been shown to be very effective in estimating a user's affective states during social interaction or under video or audio stimuli. However, due to the large number of parameters associated with the neural expression of emotion, there are still many unknowns about the specific spatial and spectral correlation between the EEG signal and the expression of affective states. To investigate this correlation, two types of semi-supervised deep learning approaches, stacked denoising autoencoder (SDAE) and deep belief networks (DBN), were applied as application-specific feature extractors for the affective states classification problem using EEG signals. To evaluate the efficacy of the proposed semi-supervised approaches, a subject-specific affective states classification experiment was carried out on the DEAP database to classify 2-dimensional affect states. The DBN based model achieved averaged F1 scores of 86.67%, 86.60% and 86.69% for arousal, valence and liking states classification respectively, which significantly improves on the state-of-the-art classification performance. By examining the weight vectors at each layer, we were also able to gain insights into the spatial or spectral locations of the most discriminating features. 
Another main advantage of applying the semi-supervised learning methods is that only a small fraction of labeled data, e.g., 1/6 of the training samples, was used in this study.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123368898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
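The EEG entry above reports results as F1 scores. As a reminder of what that metric measures, here is a minimal sketch of a binary F1 computation. The function name and the toy labels are illustrative assumptions; how the paper averages scores across subjects is not stated in the abstract and is not reproduced here.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = high arousal, 0 = low arousal.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(f1_score(y_true, y_pred))
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the two classes are unbalanced.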
Detection of fake 3D video using CNN
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813368
Shuvendu Rana, S. Gaj, A. Sur, P. Bora
{"title":"Detection of fake 3D video using CNN","authors":"Shuvendu Rana, S. Gaj, A. Sur, P. Bora","doi":"10.1109/MMSP.2016.7813368","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813368","url":null,"abstract":"In this paper, a novel automatic recognition scheme is proposed to distinguish 3D video converted from 2D video by a 2D-to-3D conversion process (fake 3D) from 3D video captured directly with a 3D camera (real 3D). To separate real from fake 3D, pre-filtering with the dual-tree complex wavelet transform is used to bring out the edge and the vertical and horizontal parallax characteristics of real and fake 3D videos. A convolutional neural network (CNN) is trained on these 3D characteristics to distinguish the fake 3D videos from the real ones. A comprehensive set of experiments has been carried out to justify the efficacy of the proposed scheme over the existing literature.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122381912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Optimised selection of structure of pictures for video coding
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813359
Vigneswaran Poobalasingam, E. Izquierdo, Saverio G. Blasi, M. Mrak
{"title":"Optimised selection of structure of pictures for video coding","authors":"Vigneswaran Poobalasingam, E. Izquierdo, Saverio G. Blasi, M. Mrak","doi":"10.1109/MMSP.2016.7813359","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813359","url":null,"abstract":"Encoders based on the High Efficiency Video Coding (HEVC) standard consider an input sequence as a succession of slices grouped in Structures of Pictures (SOP). The SOP used while encoding specifies many parameters, such as the coding order of frames, or the reference frames used during inter-prediction. Reference encoders typically make use of a fixed SOP structure of a given size, which is periodically repeated throughout the whole sequence. In this paper, the usage of unconventional SOP structures is first analysed, showing that most sequences benefit from usage of larger SOPs, and that the selection of the optimal SOP is highly content dependent. As a result, an algorithm is proposed to automatically select the optimal SOP size based on a low-complexity texture analysis of neighbouring frames. The algorithm is capable of adaptively changing SOP size during the encoding. Extensive evaluation shows that consistent bit-rate reductions are reported at the same objective quality as an effect of using the proposed algorithm.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125331930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
An embedded 3D geometry score for mobile 3D visual search
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813366
Hanwei Wu, Haopeng Li, M. Flierl
{"title":"An embedded 3D geometry score for mobile 3D visual search","authors":"Hanwei Wu, Haopeng Li, M. Flierl","doi":"10.1109/MMSP.2016.7813366","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813366","url":null,"abstract":"The scoring function is a central component in mobile visual search. In this paper, we propose an embedded 3D geometry score for mobile 3D visual search (M3DVS). In contrast to conventional mobile visual search, M3DVS uses not only the visual appearance of query objects, but also the underlying 3D geometry. The proposed scoring function interprets visual search as a process that reduces uncertainty among candidate objects when observing a query. For M3DVS, the uncertainty is reduced by both appearance-based visual similarity and 3D geometric similarity. For the latter, we give an algorithm for estimating the query-dependent threshold for geometric similarity. In contrast to visual similarity, the threshold for geometric similarity is relative due to the constraints of image-based 3D reconstruction. The experimental results show that the embedded 3D geometry score improves the recall-data rate performance when compared to a conventional visual score or 3D geometry-based re-ranking.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114293790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Two-layer large-scale cover song identification system based on music structure segmentation
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813372
Kang Cai, Deshun Yang, Xiaoou Chen
{"title":"Two-layer large-scale cover song identification system based on music structure segmentation","authors":"Kang Cai, Deshun Yang, Xiaoou Chen","doi":"10.1109/MMSP.2016.7813372","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813372","url":null,"abstract":"This paper focuses on cover song identification over a large-scale dataset. Identifying all covers of a query song from a music collection is a challenging task since covers vary in multiple aspects, such as tempo, key, and structure. For a large-scale dataset, cover song identification is more challenging and few works have been published. Previous works usually use a single representation for a whole song, such as the 2D Fourier transform and chord profiles, which cannot reflect the property that covers are largely determined by a local similarity. To address this problem, we propose a novel cover song identification method based on music structure segmentation. The proposed structural method identifies cover songs at the section level instead of the song level. The experimental results show that the structural method improves the mean average precision of the 2D Fourier transform method from 9.5% to 12.1%. In addition, we also propose a two-layer cover song identification system to improve the efficiency.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130218695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
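The cover song entry above is evaluated with mean average precision (MAP). The following is a minimal sketch of how MAP is commonly computed for a retrieval task; the function names and the toy rankings are assumptions made for illustration and are not taken from the paper.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(queries):
    """queries: list of (ranked result ids, ids of the true covers)."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Two hypothetical queries with their ranked candidates and true covers.
queries = [
    (["a", "b", "c", "d"], ["a", "c"]),
    (["x", "y", "z"], ["z"]),
]
print(mean_average_precision(queries))
```

MAP rewards systems that place true covers near the top of the ranking, which is why it suits large collections where each query has only a few covers.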
Layer-based temporal dependent rate-distortion optimization in Random-Access hierarchical video coding
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813408
Yanbo Gao, Ce Zhu, Shuai Li, Tianwu Yang
{"title":"Layer-based temporal dependent rate-distortion optimization in Random-Access hierarchical video coding","authors":"Yanbo Gao, Ce Zhu, Shuai Li, Tianwu Yang","doi":"10.1109/MMSP.2016.7813408","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813408","url":null,"abstract":"Rate-distortion optimization (RDO) plays an important part in improving the coding efficiency of High Efficiency Video Coding (HEVC), especially for the hierarchical coding structure defined in the Random-Access (RA) configuration, noted as Random-Access Hierarchical Video Coding (RA-HVC), where different frames are assigned to different temporal layers and further coded with different coding parameters. Due to the inter-frame prediction, coding result of one unit may affect the coding performance of the following temporally related units. Therefore, the temporal dependency among units needs to be considered in the coding process. However, the RDO process in the current video codec is performed without considering the varying temporal dependency, thus compromising the rate-distortion performance significantly. To address this problem, a layer-based temporal dependent RDO method is proposed in this paper where the temporal dependency among different frames in the same or different layers is examined. By reformulating the temporal dependent RDO for the RA-HVC, we show that it can be implemented in a way of simply refining the Lagrange multiplier. 
Experimental results show that the proposed method achieves, on average, about 1.4% BD-rate savings with a negligible increase in encoding time for the random-access configuration.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121034428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
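The rate-distortion entry above states that temporal dependency can be handled by refining the Lagrange multiplier. A minimal sketch of the underlying idea: an encoder picks the candidate that minimizes J = D + λR, so changing λ per temporal layer changes the decision. The candidate numbers and the per-layer scale factors below are hypothetical and are not the values derived in the paper.

```python
def best_mode(candidates, lam):
    """Pick the candidate minimizing the Lagrangian cost J = D + lam * R.

    candidates: list of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical coding modes for one block: (name, distortion, rate).
candidates = [("skip", 120.0, 2), ("inter", 60.0, 20), ("intra", 30.0, 60)]

base_lambda = 2.0
# Assumed scale factors: frames in lower layers are referenced by more
# frames, so they get a smaller multiplier (quality is favoured over rate).
layer_scale = {0: 0.25, 1: 1.0, 2: 2.0}

for layer, scale in sorted(layer_scale.items()):
    lam = base_lambda * scale
    name, distortion, rate = best_mode(candidates, lam)
    print(f"layer {layer}: lambda={lam:.2f} -> {name}")
```

With these toy numbers the decision moves from the expensive low-distortion mode in layer 0 to the cheapest mode in layer 2, which shows how a multiplier adjustment alone can steer bits toward frames that other frames depend on.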
Reliability-based mesh-to-grid image reconstruction
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813344
Ján Koloda, Jürgen Seiler, André Kaup
{"title":"Reliability-based mesh-to-grid image reconstruction","authors":"Ján Koloda, Jürgen Seiler, André Kaup","doi":"10.1109/MMSP.2016.7813344","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813344","url":null,"abstract":"This paper presents a novel method for the reconstruction of images from samples located at non-integer positions, called a mesh. This is a common scenario for many image processing applications, such as super-resolution, warping or virtual view generation in multi-camera systems. The proposed method relies on a set of initial estimates that are later refined by a new reliability-based content-adaptive framework that employs denoising in order to reduce the reconstruction error. The reliability of the initial estimate is computed so that stronger denoising is applied to less reliable estimates. The proposed technique can improve the reconstruction quality by more than 2 dB (in terms of PSNR) with respect to the initial estimate and it outperforms the state-of-the-art denoising-based refinement by up to 0.7 dB.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126664277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
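The mesh-to-grid entry above quotes its gains in dB of PSNR. For reference, here is a minimal sketch of the PSNR computation and of how a gain between an initial and a refined estimate would be measured. The pixel values are made up for illustration, and an 8-bit peak of 255 is assumed.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical pixel rows: ground truth, initial estimate, refined estimate.
reference = [100, 110, 120, 130]
initial = [104, 106, 124, 126]   # error of 4 per pixel
refined = [102, 108, 122, 128]   # error of 2 per pixel

gain_db = psnr(reference, refined) - psnr(reference, initial)
print(f"PSNR gain: {gain_db:.2f} dB")
```

Halving the per-pixel error quarters the mean squared error, which corresponds to a gain of about 6 dB; the gains reported in the abstract are on real images and are not reproduced by this toy example.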
Assessment of sound source localization of an intra-aural audio wearable device for audio augmented reality applications
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813370
Narimene Lezzoum, J. Voix
{"title":"Assessment of sound source localization of an intra-aural audio wearable device for audio augmented reality applications","authors":"Narimene Lezzoum, J. Voix","doi":"10.1109/MMSP.2016.7813370","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813370","url":null,"abstract":"This paper presents a study on the effect of an intra-aural Audio Wearable Device (AWD) on sound source localization. The AWD used in this study is equipped with a miniature outer-ear microphone, a miniature digital signal processor and a miniature internal loudspeaker in addition to other electronics. It is intended for audio augmented reality applications, for example to play 3D audio sounds and stimuli while keeping the wearer protected from loud or unwanted ambient noise. This AWD is evaluated in terms of ambient source localization using three localization cues computed using signals played from different positions in the horizontal and sagittal planes and recorded in the ear canals of an artificial head with and without a pair of AWDs. The localization cues are: the inter-aural time difference, the inter-aural level difference, and the head related transfer functions. 
Results showed that the AWD barely affects the localization of low-frequency sounds, with a localization error of around 2°, and only affects the localization of higher-frequency sounds depending on their position and frequency range.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131058161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
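Two of the localization cues named in the entry above, the inter-aural time difference (ITD) and the inter-aural level difference (ILD), can be estimated from a pair of ear-canal recordings. The sketch below uses a brute-force cross-correlation for the ITD and an energy ratio for the ILD. The signals, the sample rate, and the function names are assumptions for illustration; the paper's actual estimation procedure is not described in the abstract.

```python
import math


def itd_samples(left, right, max_lag):
    """Lag (in samples) that maximizes the cross-correlation of the two ears.

    A positive result means the right-ear signal arrives later than the left.
    """
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[i] * right[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def ild_db(left, right):
    """Level difference in dB (left relative to right) from signal energies."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)


# Hypothetical recordings: the right ear receives the same pulse
# two samples later and at half the amplitude.
left = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 0.5, 0.25, 0.0, 0.0]

sample_rate_hz = 48000  # assumed sample rate
lag = itd_samples(left, right, max_lag=3)
print(f"ITD: {lag} samples = {1e6 * lag / sample_rate_hz:.1f} microseconds")
print(f"ILD: {ild_db(left, right):.2f} dB")
```

Comparing such cues with and without the device in place is one way to quantify how much a wearable disturbs natural localization.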
Optimizing tone mapping operators for keypoint detection under illumination changes
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP) Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813340
A. Rana, G. Valenzise, F. Dufaux
{"title":"Optimizing tone mapping operators for keypoint detection under illumination changes","authors":"A. Rana, G. Valenzise, F. Dufaux","doi":"10.1109/MMSP.2016.7813340","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813340","url":null,"abstract":"Tone mapping operators (TMO) have recently raised interest for their capability to handle illumination changes. However, these TMOs are optimized with respect to perception rather than image analysis tasks like key point detection. Moreover, no work has been done to analyze the factors affecting the optimization of TMOs for such tasks. In this paper, we investigate the influence of two factors, the Correlation Coefficient (CC) and the Repeatability Rate (RR) of the tone mapped images, on the optimization of classical Retinex based models to enhance key point detection under illumination changes. CC-based optimized models aim at increasing the similarity of the tone mapped images. Conversely, RR-based optimized models quantify the optimal detection performance gains. By considering two simple Retinex based models, i.e., Gaussian and bilateral filtering, we show that, although they estimate the illumination as precisely as possible, CC-based optimized models do not necessarily lead to optimal key point detection performance. We conclude that, instead, other criteria specific to RR-based optimized models should be taken into account. 
Moreover, large gains in performance with respect to existing popular TMOs motivate further research towards optimal tone mapping techniques for computer vision applications.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132086894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
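The tone mapping entry above contrasts two optimization criteria, the correlation coefficient (CC) between tone mapped images and the repeatability rate (RR) of detected key points. A minimal sketch of both measures follows. The Pearson form of CC, the distance tolerance, the greedy one-to-one matching, and the normalization by the smaller key point set are assumptions; the exact definitions used in the paper are not given in the abstract.

```python
import math


def correlation_coefficient(a, b):
    """Pearson correlation between two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def repeatability_rate(kp_a, kp_b, eps=1.5):
    """Fraction of key points re-detected within eps pixels (greedy matching)."""
    if not kp_a or not kp_b:
        return 0.0
    used = set()
    repeated = 0
    for xa, ya in kp_a:
        for j, (xb, yb) in enumerate(kp_b):
            if j not in used and math.hypot(xa - xb, ya - yb) <= eps:
                used.add(j)  # each key point in kp_b is matched at most once
                repeated += 1
                break
    return repeated / min(len(kp_a), len(kp_b))


# Hypothetical data: the same scene under two illuminations after tone mapping.
pixels_a = [10, 20, 30, 40]
pixels_b = [12, 22, 32, 42]
kp_a = [(10, 10), (20, 20), (30, 30)]
kp_b = [(10, 11), (20.5, 20), (50, 50)]

print(correlation_coefficient(pixels_a, pixels_b))
print(repeatability_rate(kp_a, kp_b))
```

In this toy case the two images are perfectly correlated, yet only two of three key points are repeated, which mirrors the abstract's observation that a high CC does not guarantee the best detection performance.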