Latest publications from the 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)

Test Zonal Search Based on Region Label (TZSR) for Motion Estimation in HEVC
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547127
Iris Linck, A. T. Gómez, G. Alaghband
Abstract: This paper presents a new complexity-reduction method for the diamond search pattern, called TZSR, based on region labels in HEVC/H.265 video coding. The solution introduces to HEVC/H.265 a new image-region structure derived from a simplified version of the blob coloring (image labeling) algorithm. Regions correspond to whole image objects or parts of them and normally span several of the coding tree blocks produced during HEVC encoding. Our method executes a complete Diamond Search (DS) for the first block of each region in order to identify the motion-vector direction among the eight directions of DS. Motion estimation (ME) for the remaining blocks in the region performs a modified DS in which only one direction point is tested at various distances, reducing the search complexity. Experimental results demonstrate that the speedup achieved by our solution outweighs the time spent on the blob coloring algorithm. Furthermore, TZSR achieves an average encoding-time speedup of 42.61% for the low-delay (LD) configuration and 52.13% for random access (RA) compared to the original ME algorithm in the HEVC reference software (HM-16.7), with overall gains in PSNR (peak signal-to-noise ratio) and bit rate of around 18.67% and 0.1 under LD and 28.37% and 0.74 under RA, respectively.
Citations: 2
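A minimal sketch of the block-matching idea described in this abstract: a full diamond search identifies the dominant motion-vector direction for a region's first block, and the remaining blocks test candidates along that single direction only. Block size, search range, and the plain SAD cost below are illustrative assumptions, not the HM-16.7 implementation.

```python
import numpy as np

BLOCK = 16          # assumed block size
SEARCH_RANGE = 8    # assumed search range in pixels

# The 8 direction points of the (large) diamond search pattern.
DIAMOND_DIRS = [(-2, 0), (2, 0), (0, -2), (0, 2),
                (-1, -1), (-1, 1), (1, -1), (1, 1)]

def sad(ref, cur, y, x, dy, dx):
    """Sum of absolute differences between a current block and a displaced reference block."""
    h, w = ref.shape
    yy, xx = y + dy, x + dx
    if yy < 0 or xx < 0 or yy + BLOCK > h or xx + BLOCK > w:
        return np.inf
    return np.abs(cur[y:y+BLOCK, x:x+BLOCK].astype(np.int32)
                  - ref[yy:yy+BLOCK, xx:xx+BLOCK].astype(np.int32)).sum()

def full_diamond_search(ref, cur, y, x):
    """Full DS for a region's first block: returns the index of the best direction."""
    costs = [sad(ref, cur, y, x, dy, dx) for dy, dx in DIAMOND_DIRS]
    return int(np.argmin(costs))

def directional_search(ref, cur, y, x, dir_idx):
    """Modified DS for the remaining blocks: test only one direction at several distances."""
    dy, dx = DIAMOND_DIRS[dir_idx]
    best_cost, best_mv = sad(ref, cur, y, x, 0, 0), (0, 0)
    for step in range(1, SEARCH_RANGE + 1):
        c = sad(ref, cur, y, x, dy * step, dx * step)
        if c < best_cost:
            best_cost, best_mv = c, (dy * step, dx * step)
    return best_mv, best_cost
```

In the full method, the direction found for a region's first block would be cached per region label and reused for the rest of that region.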
3-Stream Convolutional Networks for Video Action Recognition with Hybrid Motion Field
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547088
Wukui Yang, Shan Gao, Wenran Liu, Xiangyang Ji
Abstract: Two-stream architectures for video action recognition have recently shown great success. They encode appearance with RGB frames and motion with optical flow. However, optical flow depicts a pixel-level motion field that focuses on fine detail and struggles with large displacements. In fact, humans tend to focus on global motion rather than pixel-level motion. Inspired by this, we propose a novel 3-stream network structure with a spatial ConvNet, a pixel-level temporal ConvNet and a block-level temporal ConvNet. Integrating multi-granularity motion representations significantly outperforms architectures based on a single pixel-level motion field. Further, we can obtain the block-level motion vector field from compressed videos without extra computation. We address missing and noisy motion patterns in the motion vector field with intra-coded block rectification and flow-guided filtering, building a hybrid motion field for our block-level temporal ConvNet. Our approach obtains state-of-the-art accuracy on UCF101 (95.27%) and HMDB51 (69.21%).
Citations: 2
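The combination of the three streams can be illustrated with a simple score-level (late) fusion of per-stream class probabilities; the per-stream weights and the averaging over clips below are assumptions for illustration, not necessarily how the paper fuses its streams.

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_streams(spatial_logits, flow_logits, mv_logits, weights=(1.0, 1.5, 1.0)):
    """Weighted late fusion of three streams' class scores.

    Each *_logits array has shape (num_clips, num_classes); scores are
    averaged over clips per stream, then combined with per-stream weights
    (illustrative values, not the paper's).
    """
    streams = [spatial_logits, flow_logits, mv_logits]
    fused = sum(w * softmax(s).mean(axis=0) for w, s in zip(weights, streams))
    return int(np.argmax(fused))   # predicted action class

# Example with random scores for a 101-class problem (e.g., UCF101).
rng = np.random.default_rng(0)
pred = fuse_streams(rng.normal(size=(25, 101)),
                    rng.normal(size=(25, 101)),
                    rng.normal(size=(25, 101)))
```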
User-Independent Detection of Swipe Pressure Using a Thermal Camera for Natural Surface Interaction
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547052
Tim Dunn, Sean Banerjee, N. Banerjee
Abstract: In this paper, we use a thermal camera to distinguish hard and soft swipes performed by a user interacting with a natural surface, by detecting differences in the thermal signature of the surface due to heat transferred by the user. Unlike prior work, our approach provides swipe pressure classifiers that are user-agnostic, i.e., they recognize the swipe pressure of a novel user not present in the training set, enabling our work to be ported into natural user interfaces without user-specific calibration. Our approach achieves an average classification accuracy of 76% using random forest classifiers on a dataset of 9 subjects interacting with paper and wood, with 8 hard and 8 soft test swipes per user. We compare the results of user-agnostic classification to user-aware classification with classifiers trained by including training samples from the user. We obtain an average user-aware classification accuracy of 82% by adding up to 8 hard and 8 soft training swipes for each test user. Our approach enables seamless adaptation of generic pressure classification systems based on thermal data to the specific behavior of users interacting with natural user interfaces.
Citations: 5
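The user-agnostic protocol described above corresponds to leave-one-subject-out cross-validation of a random forest. A hedged sketch with scikit-learn, where the thermal swipe features are stood in for by a random placeholder matrix and the hyperparameters are assumed values:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

# X: one feature vector per swipe (e.g., statistics of the thermal trace),
# y: 0 = soft swipe, 1 = hard swipe, groups: subject id of each swipe.
# Random placeholders stand in for real thermal features.
rng = np.random.default_rng(0)
n_subjects, swipes_per_subject = 9, 16
X = rng.normal(size=(n_subjects * swipes_per_subject, 32))
y = np.tile([0, 1], n_subjects * swipes_per_subject // 2)
groups = np.repeat(np.arange(n_subjects), swipes_per_subject)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
accs = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf.fit(X[train_idx], y[train_idx])                   # train on 8 subjects
    accs.append(clf.score(X[test_idx], y[test_idx]))      # test on the held-out subject
print("user-agnostic accuracy:", np.mean(accs))
```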
A New Retrieval System Based on Low Dynamic Range Expansion and SIFT Descriptor
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547089
Raoua Khwildi, A. O. Zaid
Abstract: Compared with the capabilities of the human visual system (HVS), Low Dynamic Range (LDR) content is quite limited. This is unsurprising, as LDR technologies handle only 8 to 12 bits per color channel. That being said, LDR images are still used in a wide range of multimedia applications. This paper presents a solution for efficiently indexing LDR content by introducing a novel descriptor based on LDR content expansion. To increase the richness of features that are strongly dependent on the illumination of the scene, the LDR image is converted to a High Dynamic Range (HDR) one using a reverse Tone Mapping Operator (rTMO). The resulting HDR image is in turn tone mapped, and the relevant features are extracted with the Scale Invariant Feature Transform (SIFT) descriptor. The obtained features are then gathered into a vector using the Bag-of-Visual-Words (BoVW) strategy. A set of routine benchmarking experiments using the Wang and Pascal VOC databases indicates that our system performs well for image retrieval. These experiments also demonstrate that features extracted from the reverse-tone-mapped and tone-mapped images are more descriptive than those extracted from LDR and HDR content.
Citations: 4
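A hedged sketch of the SIFT plus Bag-of-Visual-Words encoding stage (the rTMO expansion and tone-mapping steps are specific to the paper and omitted); the codebook size and the OpenCV/scikit-learn usage are assumptions:

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

VOCAB_SIZE = 256  # assumed codebook size

def sift_descriptors(gray_images):
    """Collect SIFT descriptors from a list of 8-bit grayscale images (skipping empty ones)."""
    sift = cv2.SIFT_create()
    all_desc = []
    for img in gray_images:
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            all_desc.append(desc)
    return all_desc

def build_bovw(gray_images):
    """Train a visual vocabulary and encode each image as a normalized word histogram."""
    desc_list = sift_descriptors(gray_images)
    kmeans = MiniBatchKMeans(n_clusters=VOCAB_SIZE, random_state=0)
    kmeans.fit(np.vstack(desc_list))
    hists = []
    for desc in desc_list:
        words = kmeans.predict(desc)
        h = np.bincount(words, minlength=VOCAB_SIZE).astype(np.float32)
        hists.append(h / (h.sum() + 1e-9))
    return kmeans, np.array(hists)
```

Retrieval then amounts to ranking database histograms by their distance to the query histogram (e.g., L1 or cosine distance).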
Heterogeneous Spatial Quality for Omnidirectional Video
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547114
Hristina Hristova, Xavier Corbillon, G. Simon, Viswanathan Swaminathan, A. Devlic
Abstract: A video with heterogeneous spatial quality is a video in which some regions of the frame have a different quality than others (for instance, better quality could mean more pixels and less encoding distortion). Such quality-variable encoding is a key enabler of Virtual Reality applications with 360-degree videos. So far, the main technique proposed to prepare spatially heterogeneous quality is based on the concept of tiling. More recently, Facebook has implemented another approach: the offset projection, where more emphasis is put on a specific direction of the frame. In this paper, we study quality-variable 360-degree videos with two main contributions. First, we provide a theoretical analysis of the offset projection and show the impact of the parameter settings on the video quality. Second, we propose another approach, which consists of preparing the 360-degree video from a Gaussian pyramid of downscaled and blurred versions of the video. We evaluate the tiling, offset and Gaussian-based approaches in representative scenarios of heterogeneous spatial quality in 360-degree videos and highlight the main trade-offs to consider when implementing these approaches.
Citations: 10
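The Gaussian-pyramid approach can be illustrated by composing a frame that keeps full quality inside an assumed viewport and an upsampled coarse pyramid level elsewhere; the viewport rectangle and the number of levels are illustrative, not the paper's configuration:

```python
import cv2
import numpy as np

def gaussian_pyramid(frame, levels=3):
    """Progressively downscale (and implicitly blur) the frame with pyrDown."""
    pyr = [frame]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def compose_variable_quality(frame, viewport, coarse_level=2):
    """Full quality inside the viewport, upsampled coarse level outside.

    viewport = (x, y, w, h) in pixels of the equirectangular frame,
    assumed to be the region the user is predicted to look at.
    """
    h, w = frame.shape[:2]
    pyr = gaussian_pyramid(frame, levels=coarse_level)
    degraded = cv2.resize(pyr[coarse_level], (w, h), interpolation=cv2.INTER_LINEAR)
    out = degraded.copy()
    x, y, vw, vh = viewport
    out[y:y+vh, x:x+vw] = frame[y:y+vh, x:x+vw]   # keep the viewport at full quality
    return out
```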
Image Forensics in Online News
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547083
Federica Lago, Quoc-Tin Phan, G. Boato
Abstract: Recognizing fake images in online news is a challenging problem. This is especially true in critical situations, when journalists might insert high-impact images to make a piece of news more appealing to readers, neglecting to check their authenticity and provenance. Given the importance of this task, the literature contains several attempts to solve the problem from different points of view. This paper addresses the specific problem of recognizing images in online news that have been modified or mis-contextualized, i.e., images taken in a different place and/or time with respect to the event to which they are associated. To identify image tampering, a number of image forensic techniques are exploited and combined. For mis-contextualization detection, a textual analysis approach is proposed, based on features extracted from the news item the image is associated with and from textual information retrieved online using the image in question as a pivot. The obtained results are rather satisfactory on laboratory data, in some cases improving the state of the art in image forensics. The method was tested on three datasets, one of which has already been used in the literature, while the others were created ad hoc to further investigate its performance.
Citations: 3
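For the mis-contextualization side, one simple stand-in for the textual features described above is a TF-IDF cosine similarity between the hosting article and text retrieved online for the image; this is only an illustration, not the paper's pipeline, and the threshold is an assumed value:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def context_consistency(article_text, retrieved_texts, threshold=0.15):
    """Score how well text retrieved online about the image agrees with the article.

    A low maximum similarity is a (weak) hint that the image may be used
    out of context; the threshold below is illustrative only.
    """
    vect = TfidfVectorizer(stop_words="english")
    tfidf = vect.fit_transform([article_text] + retrieved_texts)
    sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    return sims.max(), bool(sims.max() < threshold)

score, suspicious = context_consistency(
    "Flood hits the city center, residents evacuated overnight.",
    ["Archive photo from a 2010 storm in another country.",
     "Stock image of heavy rain used in several articles."])
```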
Measuring Ear-Canal Reflectance and Estimating Ear-Canal Area Functions and Eardrum Reflectance
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547071
Huiqun Deng
Abstract: Ear-canal reflectance contains information about ear-canal cross-sectional area functions and eardrum reflectance. These are important in audiology and in the design of modern hearing aids and headphones. This paper investigates the inversion method for jointly estimating the ear-canal cross-sectional area function and the eardrum reflectance, given the acoustic reflectance measured at the entrance of the ear canal. It is found through physical experiments and simulations that the estimated cross-sectional area function is spatially band-limited to $2F_c/c$, where $F_c$ is the frequency bandwidth of the low-pass-filtered reflectance used in the inversion and $c$ is the speed of sound. If the actual spatial bandwidth of the area function under estimation is higher than this, Gibbs ripples appear in the estimated area function. A method is presented for accurate measurement of ear-canal reflectance, and results are presented for two subjects along with the estimated ear-canal area functions and eardrum reflectance.
Citations: 0
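The band-limit quoted in the abstract fixes the finest spatial detail the estimated area function can contain; a quick worked check under assumed values of $F_c$ and $c$:

```python
c = 343.0        # speed of sound in air, m/s (assumed)
F_c = 20000.0    # assumed low-pass bandwidth of the measured reflectance, Hz

B = 2 * F_c / c              # spatial band-limit from the abstract, cycles/m
shortest_period = 1.0 / B    # = c / (2 * F_c), shortest spatial period, m

print(f"spatial band-limit ~ {B:.1f} cycles/m")                           # ~116.6
print(f"shortest representable period ~ {shortest_period * 1e3:.1f} mm")  # ~8.6
# Area-function detail finer than this is lost in the estimate and can
# show up as Gibbs ripples, as the abstract notes.
```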
A Planar Microphone Array for Spatial Coherence-Based Source Separation
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547121
Abdullah Fahim, P. Samarasinghe, T. Abhayapala, Hanchi Chen
Abstract: We proposed a spatial coherence-based PSD estimation and source separation technique in [1] using a 32-channel spherical microphone array. While that spherical-microphone-based method exhibited satisfactory performance in separating multiple sound sources in a reverberant environment, the use of a large number of microphones remains an issue for practical deployments. In this paper, we investigate an alternative array structure to achieve spatial coherence-based source separation using a planar microphone array. This method is particularly useful for separating a limited number of sound sources in a mixed acoustic scene. The simplified array structure used here can easily be integrated into many commercial acoustic devices, such as smart-home devices, to achieve better speech enhancement.
Citations: 0
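The spatial coherence the method builds on can be illustrated by the magnitude-squared coherence between microphone pairs, computed here with SciPy on a toy four-channel signal; the array geometry and the way coherence feeds the PSD estimation are paper-specific and not reproduced:

```python
import numpy as np
from scipy.signal import coherence

fs = 16000                     # assumed sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)

# Toy 4-channel planar array: a common 500 Hz source plus per-channel noise.
rng = np.random.default_rng(0)
source = np.sin(2 * np.pi * 500 * t)
mics = np.stack([source + 0.5 * rng.normal(size=t.size) for _ in range(4)])

def pairwise_coherence(signals, fs, nperseg=1024):
    """Magnitude-squared coherence for every microphone pair."""
    n = signals.shape[0]
    out = {}
    for i in range(n):
        for j in range(i + 1, n):
            f, Cxy = coherence(signals[i], signals[j], fs=fs, nperseg=nperseg)
            out[(i, j)] = (f, Cxy)
    return out

coh = pairwise_coherence(mics, fs)
# High coherence near 500 Hz reflects the common (directional) source;
# low coherence elsewhere reflects the per-channel (diffuse) noise.
```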
Non-Local Super Resolution in Ultrasound Imaging
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547090
P. Khavari, A. Asif, H. Rivaz
Abstract: The resolution of ultrasound (US) images is limited by physical constraints and hardware restrictions, such as the frequency, width and focal zone of the US beam. Interpolation methods are often used to increase the sampling rate of ultrasound images; however, they generally introduce blur. Herein, we present a super-resolution (SR) algorithm for reconstructing B-mode images using information from the envelope of the radio-frequency (RF) data. Our method exploits repetitive data in the non-local neighborhood of samples. The performance of the proposed approach is evaluated both qualitatively and quantitatively using phantom and in vivo data.
Citations: 5
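The use of repetitive data in a non-local neighborhood is in the spirit of non-local means: each sample is rebuilt as a weighted average of samples whose surrounding patches look similar. A compact, unoptimized sketch on a 2-D envelope image; patch size, search window and the filtering parameter are assumptions, and the paper's SR reconstruction is more involved than this:

```python
import numpy as np

def non_local_filter(img, patch=3, search=7, h=0.1):
    """Weighted non-local averaging of a 2-D envelope image (values scaled to [0, 1])."""
    p, s = patch // 2, search // 2
    padded = np.pad(img, p + s, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            yc, xc = y + p + s, x + p + s
            ref = padded[yc-p:yc+p+1, xc-p:xc+p+1]        # patch around the target sample
            weights, values = [], []
            for dy in range(-s, s + 1):
                for dx in range(-s, s + 1):
                    cand = padded[yc+dy-p:yc+dy+p+1, xc+dx-p:xc+dx+p+1]
                    d2 = np.mean((ref - cand) ** 2)        # patch similarity
                    weights.append(np.exp(-d2 / (h * h)))
                    values.append(padded[yc + dy, xc + dx])
            w = np.array(weights)
            out[y, x] = np.dot(w, values) / w.sum()
    return out
```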
Efficient Object Tracking in Compressed Video Streams with Graph Cuts
2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) Pub Date: 2018-08-01 DOI: 10.1109/MMSP.2018.8547120
Fernando Bombardelli da Silva, Serhan Gul, Daniel Becker, Matthias Schmidt, C. Hellge
Abstract: In this paper we present a compressed-domain object tracking algorithm for H.264/AVC compressed videos and integrate it into an indoor vehicle tracking scenario at a car park. The algorithm takes an initial segmentation map or bounding box of the target object in the first frame of the video sequence as input and applies Graph Cuts optimization based on a Markov Random Field model. It does not rely on pixels (except for the first frame) and uses only the codec motion vectors and block coding modes extracted from the H.264/AVC bitstream via inexpensive partial decoding. In this way, we significantly reduce the compute and storage requirements of our system compared to “pixel-domain” tracking algorithms that first fully decode the video stream and work on reconstructed pixels. We demonstrate the quantitative performance of our algorithm on the VOT2016 dataset, integrate it into a camera-based parking management system, and show qualitative results in a real application scenario. Results show that our compressed-domain algorithm provides a good compromise between high-accuracy tracking and low-complexity processing, demonstrating that it is feasible for scenarios requiring large-scale object tracking under bandwidth-limited conditions.
Citations: 4
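A hedged sketch of the MRF/graph-cut step on a grid of per-block motion-vector magnitudes, using the PyMaxflow library (an assumption; the paper's exact energy terms over motion vectors and coding modes are not reproduced). Unary terms favor labeling blocks with large motion as foreground, and a pairwise Potts term encourages spatially smooth labels:

```python
import numpy as np
import maxflow  # PyMaxflow: pip install PyMaxflow

def segment_moving_blocks(mv_magnitude, fg_thresh=2.0, smoothness=1.5):
    """Binary foreground/background labeling of a block grid via graph cuts.

    mv_magnitude: 2-D array of motion-vector magnitudes per macroblock.
    The unary costs (distance to a threshold) and the smoothness weight
    are illustrative values, not the paper's energy.
    """
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(mv_magnitude.shape)
    # Pairwise (smoothness) edges between 4-connected neighboring blocks.
    g.add_grid_edges(nodes, smoothness)
    # Unary terms: cost of labeling each block background vs. foreground.
    cost_bg = np.maximum(mv_magnitude - fg_thresh, 0.0)   # static blocks are cheap as background
    cost_fg = np.maximum(fg_thresh - mv_magnitude, 0.0)   # moving blocks are cheap as foreground
    g.add_grid_tedges(nodes, cost_fg, cost_bg)
    g.maxflow()
    return g.get_grid_segments(nodes)   # True where the block is labeled foreground

# Toy 8x8 block grid with a 3x3 moving object.
mv = np.zeros((8, 8))
mv[2:5, 3:6] = 5.0
mask = segment_moving_blocks(mv)   # True inside the moving region
```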