Title: Surface Normal Data Guided Depth Recovery with Graph Laplacian Regularization
Authors: Longhua Sun, Jin Wang, Yunhui Shi, Qing Zhu, Baocai Yin
DOI: 10.1145/3338533.3366582
Abstract: High-quality depth information has been used in an increasing number of real-world multimedia applications in recent years. Due to the limitations of depth sensors and sensing technology, however, captured depth maps usually have low resolution and contain holes. In this paper, inspired by the geometric relationship between the surface normals of a 3D scene and their distance from the camera, we find that a surface normal map can provide additional spatial geometric constraints for depth map reconstruction, since a depth map is a special image carrying spatial information, often called a 2.5D image. To exploit this property, we propose a novel surface normal data guided depth recovery method, which uses surface normal data and observed depth values to estimate missing or interpolated depth values. Moreover, to preserve the inherent piecewise-smooth characteristic of depth maps, a graph Laplacian prior is applied to regularize the inverse problem of depth map recovery, and a graph Laplacian regularizer (GLR) is proposed. Finally, the spatial geometric constraint and graph Laplacian regularization are integrated into a unified optimization framework, which can be efficiently solved by conjugate gradient (CG). Extensive quantitative and qualitative evaluations against state-of-the-art schemes show the effectiveness and superiority of our method.
{"title":"Color Recovery from Multi-Spectral NIR Images Using Gray Information","authors":"Qingtao Fu, Cheolkon Jung, Chen Su","doi":"10.1145/3338533.3368259","DOIUrl":"https://doi.org/10.1145/3338533.3368259","url":null,"abstract":"Converting near-infrared (NIR) images into color images is a challenging task due to the different characteristics of visible and NIR images. Most methods of generating color images directly from a single NIR image are limited by the scene and object categories. In this paper, we propose a novel approach to recovering object colors from multi-spectral NIR images using gray information. The multi-spectral NIR images are obtained by a 2-CCD NIR/RGB camera with narrow NIR bandpass filters of different wavelengths. The proposed approach is based on multi-spectral NIR images to estimate a conversion matrix for NIR to RGB conversion. In addition to the multi-spectral NIR images, a corresponding gray image is used as a complementary channel to estimate the conversion matrix for NIR to RGB color conversion. The conversion matrix is obtained from the ColorChecker's 24 color blocks using polynomial regression and applied to real-world scene NIR images for color recovery. The proposed approach has been evaluated by a large number of real-world scene images, and the results show that the proposed approach is simple yet effective for recovering color of objects.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"48 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130669564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalizing Rate Control Strategies for Realtime Video Streaming via Learning from Deep Learning","authors":"Tianchi Huang, Ruixiao Zhang, Chenglei Wu, Xin Yao, Chao Zhou, Bing Yu, Lifeng Sun","doi":"10.1145/3338533.3366606","DOIUrl":"https://doi.org/10.1145/3338533.3366606","url":null,"abstract":"The leading learning-based rate control method, i.e., QARC, achieves state-of-the-art performances but fails to interpret the fundamental principles, and thus lacks the abilities to further improve itself efficiently. In this paper, we propose EQARC (Explainable QARC) via reconstructing QARC's modules, aiming to demystify how QARC works. In details, we first utilize a novel hybrid attention-based CNN+GRU model to re-characterize the original quality prediction network and reasonably replace the QARC's 1D-CNN layers with 2D-CNN layers. Using trace-driven experiment, we demonstrate the superiority of EQARC over existing state-of-the-art approaches. Next, we collect several useful information from each interpretable modules and learn the insight of EQARC. Following this step, we further propose AQARC (Advanced QARC), which is the light-weighted version of QARC. Experimental results show that AQARC achieves the same performances as the QARC with an overhead reduction of 90%. In short, through learning from deep learning, we generalize a rate control method which can both reach high performance and reduce computation cost.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133029202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Adaptive Dark Region Detail Enhancement Method for Low-light Images","authors":"Wengang Cheng, Caiyun Guo, Haitao Hu","doi":"10.1145/3338533.3366584","DOIUrl":"https://doi.org/10.1145/3338533.3366584","url":null,"abstract":"The images captured in low-light conditions are often of poor visual quality as most of details in dark regions buried. Although some advanced low-light image enhancement methods could lighten an image and its dark regions, they still cannot reveal the details in dark regions very well. This paper presents an adaptive dark region detail enhancement method for low-light images. As our method is based on the Retinex theory, we first formulate the Retinex-based low-light image enhancement problem into a Bayesian optimization framework. Then, a dark region prior is proposed and an adaptive gradient amplification strategy is designed to incorporate this prior into the illumination estimation. The dark region prior, together with the widely used spatial smooth and structure priors, leads to a dark region and structure-aware smoothness regularization term for illumination optimization. We provide a solver to this optimization and get final enhanced results after post processing. Experiments demonstrate that our method can obtain good enhancement results with better dark region details compared to several state-of-the-art methods.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133853269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Learn to Gesture: Let Your Body Speak
Authors: Tian Gan, Zhixin Ma, Yu Lu, Xuemeng Song, Liqiang Nie
DOI: 10.1145/3338533.3366602
Abstract: Presentation is one of the most important and vivid ways to deliver information to an audience. Apart from the content of a presentation, how the speaker behaves while presenting makes a big difference. In other words, gestures, as part of visual perception and synchronized with verbal information, express subtle information that voice or words alone cannot deliver. One of the most effective ways to improve a presentation is to practice with feedback and suggestions from an expert; however, hiring human experts is expensive and thus impractical most of the time. To this end, we propose a speech-to-gesture network (POSE) that generates exemplary body language given a speech recording as input. Specifically, we build an "expert" Speech-Gesture database from featured TED talk videos and design a two-layer attentive recurrent encoder-decoder network to learn the translation from speech to gesture, as well as the hierarchical structure within gestures. Finally, given a speech audio sequence, an appropriate gesture is generated and visualized for more effective communication. Both objective and subjective validation show the effectiveness of the proposed method.
Title: Deep Structural Feature Learning: Re-Identification of Similar Vehicles in a Structure-Aware Map Space
Authors: Wenqian Zhu, R. Hu, Zhongyuan Wang, Dengshi Li, Xiyue Gao
DOI: 10.1145/3338533.3366585
Abstract: Vehicle re-identification (re-ID) has received growing attention in recent years as a significant task that contributes greatly to intelligent video surveillance. The complex intra-class and inter-class variations of vehicle images pose huge challenges for vehicle re-ID, especially for re-identifying similar vehicles. In this paper we focus on an interesting and challenging problem: vehicle re-ID for the same or similar models. Previous works mainly focus on extracting global features using deep models, ignoring the individual local regions in the vehicle front window, such as decorations and stickers attached to the windshield, which can be more discriminative for vehicle re-ID. Instead of directly embedding these regions to learn their features, we propose a Regional Structure-Aware model (RSA) that learns structure-aware cues from the position distribution of individual local regions in the front window area, constructing a front-window (FW) structural map space. In this map space, deep models can learn more robust and discriminative spatial structure-aware features, improving vehicle re-ID performance for the same or similar models. We evaluate our method on the large-scale vehicle re-ID dataset Vehicle-1M. The experimental results show that our method achieves promising performance and outperforms several recent state-of-the-art approaches.
{"title":"Session details: Vision in Multimedia","authors":"H. Hang","doi":"10.1145/3379197","DOIUrl":"https://doi.org/10.1145/3379197","url":null,"abstract":"","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":" August","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113946847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehensive Event Storyline Generation from Microblogs","authors":"Wenjin Sun, Yuhang Wang, Yuqi Gao, Zesong Li, J. Sang, Jian Yu","doi":"10.1145/3338533.3366601","DOIUrl":"https://doi.org/10.1145/3338533.3366601","url":null,"abstract":"Microblogging data contains a wealth of information of trending events and has gained increased attention among users, organizations, and research scholars for social media mining in different disciplines. Event storyline generation is one typical task of social media mining, whose goal is to extract the development stages with associated description of events. Existing storyline generation methods either generate storyline with less integrity or fail to guarantee the coherence between the discovered stages. Secondly, there are no scientific method to evaluate the quality of the storyline. In this paper, we propose a comprehensive storyline generation framework to address the above disadvantages. Given Microblogging data related to the specified event, we first propose Hot-Word-Based stage detection algorithm to identify the potential stages of event, which can effectively avoid ignoring important stages and preventing inconsistent sequence between stages. Community detection algorithm is applied then to select representative data for each stage. Finally, we conduct graph optimization algorithm to generate the logically coherent storylines of the event. We also introduce a new evaluation metric, SLEU, to emphasize the importance of the integrity and coherence of the generated storyline. Extensive experiments on real-world Chinese microblogging data demonstrate the effectiveness of the proposed methods in each module and the overall framework.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"201 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116159636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Prior Guided Face Inpainting","authors":"Zeyang Zhang, Xiaobo Zhou, Shengjie Zhao, Xiaoyan Zhang","doi":"10.1145/3338533.3366587","DOIUrl":"https://doi.org/10.1145/3338533.3366587","url":null,"abstract":"Face inpainting is a sub-task of image inpainting designed to repair broken or occluded incomplete portraits. Due to the high complexity of face image details, inpainting on the face is more difficult. At present, face-related tasks often draw on excellent methods from face recognition and face detection, using multitasking to boost its effect. Therefore, this paper proposes to add the face prior knowledge to the existing advanced inpainting model, combined with perceptual loss and SSIM loss to improve the model repair efficiency. A new face inpainting process and algorithm is implemented, and the repair effect is improved.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116252144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}