Latest publications from the 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)

Test Zonal Search Based on Region Label (TZSR) for Motion Estimation in HEVC
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547127
Iris Linck, A. T. Gómez, G. Alaghband
Abstract: This paper presents a new complexity-reduction method for the diamond search pattern, called TZSR, based on region labels in HEVC/H.265 video coding. The solution introduces to HEVC/H.265 a new image-region structure derived from a simplified version of the blob coloring (image labeling) algorithm. Regions correspond to whole image objects or parts of them and normally span several of the coding tree blocks produced during HEVC encoding. Our method executes a complete Diamond Search (DS) for the first block of each region in order to identify the motion-vector direction among the eight directions of DS. Motion estimation (ME) for the remaining blocks in the region performs a modified DS in which only one direction point is tested at various distances, reducing the search complexity. Experimental results demonstrate that the speedup achieved by our solution outweighs the time spent on the blob coloring algorithm. Furthermore, TZSR achieves an average encoding-time speedup of 42.61% for the low-delay (LD) configuration and 52.13% for random access (RA) compared to the original ME algorithm in the HEVC reference software (HM-16.7), with overall gains in PSNR (peak signal-to-noise ratio) and bit rate of around 18.67% and 0.1 under LD and 28.37% and 0.74 under RA, respectively.
Citations: 2
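A minimal sketch of the block-matching idea described in this abstract: a full diamond search identifies the dominant motion-vector direction for a region's first block, and the remaining blocks test candidates along that single direction only. Block size, search range, and the plain SAD cost below are illustrative assumptions, not the HM-16.7 implementation.

```python
import numpy as np

BLOCK = 16          # assumed block size
SEARCH_RANGE = 8    # assumed search range in pixels

# The 8 direction points of the (large) diamond search pattern.
DIAMOND_DIRS = [(-2, 0), (2, 0), (0, -2), (0, 2),
                (-1, -1), (-1, 1), (1, -1), (1, 1)]

def sad(ref, cur, y, x, dy, dx):
    """Sum of absolute differences between a current block and a displaced reference block."""
    h, w = ref.shape
    yy, xx = y + dy, x + dx
    if yy < 0 or xx < 0 or yy + BLOCK > h or xx + BLOCK > w:
        return np.inf
    return np.abs(cur[y:y+BLOCK, x:x+BLOCK].astype(np.int32)
                  - ref[yy:yy+BLOCK, xx:xx+BLOCK].astype(np.int32)).sum()

def full_diamond_search(ref, cur, y, x):
    """Full DS for a region's first block: returns the index of the best direction."""
    costs = [sad(ref, cur, y, x, dy, dx) for dy, dx in DIAMOND_DIRS]
    return int(np.argmin(costs))

def directional_search(ref, cur, y, x, dir_idx):
    """Modified DS for the remaining blocks: test only one direction at several distances."""
    dy, dx = DIAMOND_DIRS[dir_idx]
    best_cost, best_mv = sad(ref, cur, y, x, 0, 0), (0, 0)
    for step in range(1, SEARCH_RANGE + 1):
        c = sad(ref, cur, y, x, dy * step, dx * step)
        if c < best_cost:
            best_cost, best_mv = c, (dy * step, dx * step)
    return best_mv, best_cost
```

In the full method, the direction found for a region's first block would be cached per region label and reused for the rest of that region.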
3-Stream Convolutional Networks for Video Action Recognition with Hybrid Motion Field
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547088
Wukui Yang, Shan Gao, Wenran Liu, Xiangyang Ji
Abstract: Two-stream architectures for video action recognition have recently shown great success. They encode appearance with RGB frames and motion with optical flow. However, optical flow depicts a pixel-level motion field that focuses on fine detail and struggles with large displacements. In fact, humans tend to focus on global motion rather than pixel-level motion. Inspired by this, we propose a novel 3-stream network structure with a spatial ConvNet, a pixel-level temporal ConvNet and a block-level temporal ConvNet. Integrating multi-granularity motion representations significantly outperforms architectures based on a single pixel-level motion field. Further, we can obtain the block-level motion vector field from compressed videos without extra computation. We address missing and noisy motion patterns in the motion vector field with intra-coded block rectification and flow-guided filtering, building a hybrid motion field for our block-level temporal ConvNet. Our approach obtains state-of-the-art accuracy on UCF101 (95.27%) and HMDB51 (69.21%).
Citations: 2
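The combination of the three streams can be illustrated with a simple score-level (late) fusion of per-stream class probabilities; the per-stream weights and the averaging over clips below are assumptions for illustration, not necessarily how the paper fuses its streams.

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_streams(spatial_logits, flow_logits, mv_logits, weights=(1.0, 1.5, 1.0)):
    """Weighted late fusion of three streams' class scores.

    Each *_logits array has shape (num_clips, num_classes); scores are
    averaged over clips per stream, then combined with per-stream weights
    (illustrative values, not the paper's).
    """
    streams = [spatial_logits, flow_logits, mv_logits]
    fused = sum(w * softmax(s).mean(axis=0) for w, s in zip(weights, streams))
    return int(np.argmax(fused))   # predicted action class

# Example with random scores for a 101-class problem (e.g., UCF101).
rng = np.random.default_rng(0)
pred = fuse_streams(rng.normal(size=(25, 101)),
                    rng.normal(size=(25, 101)),
                    rng.normal(size=(25, 101)))
```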
User-Independent Detection of Swipe Pressure Using a Thermal Camera for Natural Surface Interaction
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547052
Tim Dunn, Sean Banerjee, N. Banerjee
Abstract: In this paper, we use a thermal camera to distinguish hard and soft swipes performed by a user interacting with a natural surface, by detecting differences in the thermal signature of the surface due to heat transferred by the user. Unlike prior work, our approach provides swipe pressure classifiers that are user-agnostic, i.e., they recognize the swipe pressure of a novel user not present in the training set, enabling our work to be ported into natural user interfaces without user-specific calibration. Our approach achieves an average classification accuracy of 76% using random forest classifiers on a dataset of 9 subjects interacting with paper and wood, with 8 hard and 8 soft test swipes per user. We compare the results of user-agnostic classification to user-aware classification with classifiers trained by including training samples from the user. We obtain an average user-aware classification accuracy of 82% by adding up to 8 hard and 8 soft training swipes for each test user. Our approach enables seamless adaptation of generic pressure classification systems based on thermal data to the specific behavior of users interacting with natural user interfaces.
Citations: 5
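The user-agnostic protocol described above corresponds to leave-one-subject-out cross-validation of a random forest. A hedged sketch with scikit-learn, where the thermal swipe features are stood in for by a random placeholder matrix and the hyperparameters are assumed values:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

# X: one feature vector per swipe (e.g., statistics of the thermal trace),
# y: 0 = soft swipe, 1 = hard swipe, groups: subject id of each swipe.
# Random placeholders stand in for real thermal features.
rng = np.random.default_rng(0)
n_subjects, swipes_per_subject = 9, 16
X = rng.normal(size=(n_subjects * swipes_per_subject, 32))
y = np.tile([0, 1], n_subjects * swipes_per_subject // 2)
groups = np.repeat(np.arange(n_subjects), swipes_per_subject)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
accs = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf.fit(X[train_idx], y[train_idx])                   # train on 8 subjects
    accs.append(clf.score(X[test_idx], y[test_idx]))      # test on the held-out subject
print("user-agnostic accuracy:", np.mean(accs))
```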
A New Retrieval System Based on Low Dynamic Range Expansion and SIFT Descriptor
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547089
Raoua Khwildi, A. O. Zaid
Abstract: Compared with the capabilities of the human visual system (HVS), Low Dynamic Range (LDR) content is quite limited. This is unsurprising, as LDR technologies handle only 8 to 12 bits per color channel. That being said, LDR images are still used in a wide range of multimedia applications. This paper presents a solution for efficiently indexing LDR content by introducing a novel descriptor based on LDR content expansion. To increase the richness of features that are strongly dependent on the illumination of the scene, the LDR image is converted to a High Dynamic Range (HDR) one using a reverse Tone Mapping Operator (rTMO). The resulting HDR image is in turn tone mapped, and the relevant features are extracted with the Scale Invariant Feature Transform (SIFT) descriptor. The obtained features are then gathered into a vector using the Bag-of-Visual-Words (BoVW) strategy. A set of routine benchmarking experiments using the Wang and Pascal VOC databases indicates that our system performs well for image retrieval. These experiments also demonstrate that features extracted from the reverse-tone-mapped and tone-mapped images are more descriptive than those extracted from LDR and HDR content.
Citations: 4
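A hedged sketch of the SIFT plus Bag-of-Visual-Words encoding stage (the rTMO expansion and tone-mapping steps are specific to the paper and omitted); the codebook size and the OpenCV/scikit-learn usage are assumptions:

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

VOCAB_SIZE = 256  # assumed codebook size

def sift_descriptors(gray_images):
    """Collect SIFT descriptors from a list of 8-bit grayscale images (skipping empty ones)."""
    sift = cv2.SIFT_create()
    all_desc = []
    for img in gray_images:
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            all_desc.append(desc)
    return all_desc

def build_bovw(gray_images):
    """Train a visual vocabulary and encode each image as a normalized word histogram."""
    desc_list = sift_descriptors(gray_images)
    kmeans = MiniBatchKMeans(n_clusters=VOCAB_SIZE, random_state=0)
    kmeans.fit(np.vstack(desc_list))
    hists = []
    for desc in desc_list:
        words = kmeans.predict(desc)
        h = np.bincount(words, minlength=VOCAB_SIZE).astype(np.float32)
        hists.append(h / (h.sum() + 1e-9))
    return kmeans, np.array(hists)
```

Retrieval then amounts to ranking database histograms by their distance to the query histogram (e.g., L1 or cosine distance).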
Heterogeneous Spatial Quality for Omnidirectional Video
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547114
Hristina Hristova, Xavier Corbillon, G. Simon, Viswanathan Swaminathan, A. Devlic
Abstract: A video with heterogeneous spatial quality is a video in which some regions of the frame have a different quality than others (for instance, better quality could mean more pixels and less encoding distortion). Such quality-variable encoding is a key enabler of Virtual Reality applications with 360-degree videos. So far, the main technique proposed to prepare spatially heterogeneous quality is based on the concept of tiling. More recently, Facebook has implemented another approach: the offset projection, where more emphasis is put on a specific direction of the frame. In this paper, we study quality-variable 360-degree videos with two main contributions. First, we provide a theoretical analysis of the offset projection and show the impact of the parameter settings on the video quality. Second, we propose another approach, which consists of preparing the 360-degree video from a Gaussian pyramid of downscaled and blurred versions of the video. We evaluate the tiling, offset and Gaussian-based approaches in representative scenarios of heterogeneous spatial quality in 360-degree videos and highlight the main trade-offs to consider when implementing these approaches.
Citations: 10
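The Gaussian-pyramid approach can be illustrated by composing a frame that keeps full quality inside an assumed viewport and an upsampled coarse pyramid level elsewhere; the viewport rectangle and the number of levels are illustrative, not the paper's configuration:

```python
import cv2
import numpy as np

def gaussian_pyramid(frame, levels=3):
    """Progressively downscale (and implicitly blur) the frame with pyrDown."""
    pyr = [frame]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def compose_variable_quality(frame, viewport, coarse_level=2):
    """Full quality inside the viewport, upsampled coarse level outside.

    viewport = (x, y, w, h) in pixels of the equirectangular frame,
    assumed to be the region the user is predicted to look at.
    """
    h, w = frame.shape[:2]
    pyr = gaussian_pyramid(frame, levels=coarse_level)
    degraded = cv2.resize(pyr[coarse_level], (w, h), interpolation=cv2.INTER_LINEAR)
    out = degraded.copy()
    x, y, vw, vh = viewport
    out[y:y+vh, x:x+vw] = frame[y:y+vh, x:x+vw]   # keep the viewport at full quality
    return out
```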
Image Forensics in Online News
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547083
Federica Lago, Quoc-Tin Phan, G. Boato
Abstract: Recognizing fake images in online news is a challenging problem. This is especially true in critical situations, when journalists might insert high-impact images to make a piece of news more appealing to readers, neglecting to check their authenticity and provenance. Given the importance of this task, the literature contains several attempts to solve the problem from different points of view. This paper addresses the specific problem of recognizing images in online news that have been modified or mis-contextualized, i.e., images taken in a different place and/or time with respect to the event to which they are associated. To identify image tampering, a number of image forensic techniques are exploited and combined. For mis-contextualization detection, a textual analysis approach is proposed, based on features extracted from the news item the image is associated with and from textual information retrieved online using the image in question as a pivot. The obtained results are rather satisfactory on laboratory data, in some cases improving the state of the art in image forensics. The method was tested on three datasets, one of which has already been used in the literature, while the others were created ad hoc to further investigate its performance.
Citations: 3
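For the mis-contextualization side, one simple stand-in for the textual features described above is a TF-IDF cosine similarity between the hosting article and text retrieved online for the image; this is only an illustration, not the paper's pipeline, and the threshold is an assumed value:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def context_consistency(article_text, retrieved_texts, threshold=0.15):
    """Score how well text retrieved online about the image agrees with the article.

    A low maximum similarity is a (weak) hint that the image may be used
    out of context; the threshold below is illustrative only.
    """
    vect = TfidfVectorizer(stop_words="english")
    tfidf = vect.fit_transform([article_text] + retrieved_texts)
    sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    return sims.max(), bool(sims.max() < threshold)

score, suspicious = context_consistency(
    "Flood hits the city center, residents evacuated overnight.",
    ["Archive photo from a 2010 storm in another country.",
     "Stock image of heavy rain used in several articles."])
```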
Measuring Ear-Canal Reflectance and Estimating Ear-Canal Area Functions and Eardrum Reflectance
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547071
Huiqun Deng
Abstract: Ear-canal reflectance contains information about ear-canal cross-sectional area functions and eardrum reflectance. These are important in audiology and in the design of modern hearing aids and headphones. This paper investigates the inversion method for jointly estimating the ear-canal cross-sectional area function and the eardrum reflectance, given the acoustic reflectance measured at the entrance of the ear canal. It is found through physical experiments and simulations that the estimated cross-sectional area function is spatially band-limited to $2F_c/c$, where $F_c$ is the frequency bandwidth of the low-pass-filtered reflectance used in the inversion and $c$ is the speed of sound. If the actual spatial bandwidth of the area function under estimation is higher than this, Gibbs ripples appear in the estimated area function. A method is presented for accurate measurement of ear-canal reflectance, and results are presented for two subjects along with the estimated ear-canal area functions and eardrum reflectance.
Citations: 0
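The band-limit quoted in the abstract fixes the finest spatial detail the estimated area function can contain; a quick worked check under assumed values of $F_c$ and $c$:

```python
c = 343.0        # speed of sound in air, m/s (assumed)
F_c = 20000.0    # assumed low-pass bandwidth of the measured reflectance, Hz

B = 2 * F_c / c              # spatial band-limit from the abstract, cycles/m
shortest_period = 1.0 / B    # = c / (2 * F_c), shortest spatial period, m

print(f"spatial band-limit ~ {B:.1f} cycles/m")                           # ~116.6
print(f"shortest representable period ~ {shortest_period * 1e3:.1f} mm")  # ~8.6
# Area-function detail finer than this is lost in the estimate and can
# show up as Gibbs ripples, as the abstract notes.
```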
A Planar Microphone Array for Spatial Coherence-Based Source Separation
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547121
Abdullah Fahim, P. Samarasinghe, T. Abhayapala, Hanchi Chen
Abstract: We proposed a spatial coherence-based PSD estimation and source separation technique in [1] using a 32-channel spherical microphone array. While that spherical-microphone-based method exhibited satisfactory performance in separating multiple sound sources in a reverberant environment, the use of a large number of microphones remains an issue for practical deployments. In this paper, we investigate an alternative array structure to achieve spatial coherence-based source separation using a planar microphone array. This method is particularly useful for separating a limited number of sound sources in a mixed acoustic scene. The simplified array structure used here can easily be integrated into many commercial acoustic devices, such as smart-home devices, to achieve better speech enhancement.
Citations: 0
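The spatial coherence the method builds on can be illustrated by the magnitude-squared coherence between microphone pairs, computed here with SciPy on a toy four-channel signal; the array geometry and the way coherence feeds the PSD estimation are paper-specific and not reproduced:

```python
import numpy as np
from scipy.signal import coherence

fs = 16000                     # assumed sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)

# Toy 4-channel planar array: a common 500 Hz source plus per-channel noise.
rng = np.random.default_rng(0)
source = np.sin(2 * np.pi * 500 * t)
mics = np.stack([source + 0.5 * rng.normal(size=t.size) for _ in range(4)])

def pairwise_coherence(signals, fs, nperseg=1024):
    """Magnitude-squared coherence for every microphone pair."""
    n = signals.shape[0]
    out = {}
    for i in range(n):
        for j in range(i + 1, n):
            f, Cxy = coherence(signals[i], signals[j], fs=fs, nperseg=nperseg)
            out[(i, j)] = (f, Cxy)
    return out

coh = pairwise_coherence(mics, fs)
# High coherence near 500 Hz reflects the common (directional) source;
# low coherence elsewhere reflects the per-channel (diffuse) noise.
```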
Non-Local Super Resolution in Ultrasound Imaging
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547090
P. Khavari, A. Asif, H. Rivaz
Abstract: The resolution of ultrasound (US) images is limited by physical constraints and hardware restrictions, such as the frequency, width and focal zone of the US beam. Interpolation methods are often used to increase the sampling rate of ultrasound images; however, they generally introduce blur. Herein, we present a super-resolution (SR) algorithm for reconstructing B-mode images using information from the envelope of the radio-frequency (RF) data. Our method exploits repetitive data in the non-local neighborhood of samples. The performance of the proposed approach is evaluated both qualitatively and quantitatively using phantom and in vivo data.
Citations: 5
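The use of repetitive data in a non-local neighborhood is in the spirit of non-local means: each sample is rebuilt as a weighted average of samples whose surrounding patches look similar. A compact, unoptimized sketch on a 2-D envelope image; patch size, search window and the filtering parameter are assumptions, and the paper's SR reconstruction is more involved than this:

```python
import numpy as np

def non_local_filter(img, patch=3, search=7, h=0.1):
    """Weighted non-local averaging of a 2-D envelope image (values scaled to [0, 1])."""
    p, s = patch // 2, search // 2
    padded = np.pad(img, p + s, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            yc, xc = y + p + s, x + p + s
            ref = padded[yc-p:yc+p+1, xc-p:xc+p+1]        # patch around the target sample
            weights, values = [], []
            for dy in range(-s, s + 1):
                for dx in range(-s, s + 1):
                    cand = padded[yc+dy-p:yc+dy+p+1, xc+dx-p:xc+dx+p+1]
                    d2 = np.mean((ref - cand) ** 2)        # patch similarity
                    weights.append(np.exp(-d2 / (h * h)))
                    values.append(padded[yc + dy, xc + dx])
            w = np.array(weights)
            out[y, x] = np.dot(w, values) / w.sum()
    return out
```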
Efficient Object Tracking in Compressed Video Streams with Graph Cuts
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547120
Fernando Bombardelli da Silva, Serhan Gul, Daniel Becker, Matthias Schmidt, C. Hellge
Abstract: In this paper we present a compressed-domain object tracking algorithm for H.264/AVC compressed videos and integrate it into an indoor vehicle tracking scenario at a car park. The algorithm takes an initial segmentation map or bounding box of the target object in the first frame of the video sequence as input and applies Graph Cuts optimization based on a Markov Random Field model. It does not rely on pixels (except for the first frame) and uses only the codec motion vectors and block coding modes extracted from the H.264/AVC bitstream via inexpensive partial decoding. In this way, we significantly reduce the compute and storage requirements of our system compared to “pixel-domain” tracking algorithms that first fully decode the video stream and work on reconstructed pixels. We demonstrate the quantitative performance of our algorithm on the VOT2016 dataset, integrate it into a camera-based parking management system, and show qualitative results in a real application scenario. Results show that our compressed-domain algorithm provides a good compromise between high-accuracy tracking and low-complexity processing, demonstrating that it is feasible for scenarios requiring large-scale object tracking under bandwidth-limited conditions.
Citations: 4
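A hedged sketch of the MRF/graph-cut step on a grid of per-block motion-vector magnitudes, using the PyMaxflow library (an assumption; the paper's exact energy terms over motion vectors and coding modes are not reproduced). Unary terms favor labeling blocks with large motion as foreground, and a pairwise Potts term encourages spatially smooth labels:

```python
import numpy as np
import maxflow  # PyMaxflow: pip install PyMaxflow

def segment_moving_blocks(mv_magnitude, fg_thresh=2.0, smoothness=1.5):
    """Binary foreground/background labeling of a block grid via graph cuts.

    mv_magnitude: 2-D array of motion-vector magnitudes per macroblock.
    The unary costs (distance to a threshold) and the smoothness weight
    are illustrative values, not the paper's energy.
    """
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(mv_magnitude.shape)
    # Pairwise (smoothness) edges between 4-connected neighboring blocks.
    g.add_grid_edges(nodes, smoothness)
    # Unary terms: cost of labeling each block background vs. foreground.
    cost_bg = np.maximum(mv_magnitude - fg_thresh, 0.0)   # static blocks are cheap as background
    cost_fg = np.maximum(fg_thresh - mv_magnitude, 0.0)   # moving blocks are cheap as foreground
    g.add_grid_tedges(nodes, cost_fg, cost_bg)
    g.maxflow()
    return g.get_grid_segments(nodes)   # True where the block is labeled foreground

# Toy 8x8 block grid with a 3x3 moving object.
mv = np.zeros((8, 8))
mv[2:5, 3:6] = 5.0
mask = segment_moving_blocks(mv)   # True inside the moving region
```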