Proceedings of the 24th ACM international conference on Multimedia: Latest Publications

LSOD: Local Sparse Orthogonal Descriptor for Image Matching
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2967217
Yiru Zhao, Yaoyi Li, Zhiwen Shao, Hongtao Lu
Abstract: We propose a novel feature description method for image matching. Our method is inspired by the autoencoder, an artificial neural network designed for learning efficient codings. Sparse and orthogonal constraints are imposed on the autoencoder, making it a highly discriminative descriptor. We show that the proposed descriptor is not only invariant to geometric and photometric transformations (such as viewpoint change, intensity change, noise, image blur and JPEG compression), but also highly efficient. We compare it with existing state-of-the-art descriptors on standard benchmark datasets; the experimental results show that our LSOD method yields better performance in both accuracy and efficiency.
Citations: 3
Accelerating Convolutional Neural Networks for Mobile Applications
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2967280
Peisong Wang, Jian Cheng
Abstract: Convolutional neural networks (CNNs) have achieved remarkable performance in a wide range of computer vision tasks, typically at the cost of massive computational complexity. The low speed of these networks may hinder real-time applications, especially when computational resources are limited. In this paper, an efficient and effective approach is proposed to accelerate the test-phase computation of CNNs based on low-rank and group sparse tensor decomposition. Specifically, for each convolutional layer, the kernel tensor is decomposed into the sum of a small number of low multilinear rank tensors. Then we replace the original kernel tensors in all layers with the approximate tensors and fine-tune the whole net with respect to the final classification task using standard backpropagation. Comprehensive experiments on ILSVRC-12 demonstrate a significant reduction in computational complexity, at the cost of a negligible loss in accuracy. For the widely used VGG-16 model, our approach obtains a 6.6$\times$ speed-up on PC and a 5.91$\times$ speed-up on a mobile device for the whole network, with less than a 1% increase in top-5 error.
Citations: 65
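To give a feel for why low-rank kernel approximations reduce test-phase cost, the sketch below counts multiply-accumulate operations (MACs) for a standard convolution and for a generic three-stage low-rank factorization. This is only an illustration under stated assumptions: the factorization shape (1x1, then k x k, then 1x1), the function names, and the layer sizes are hypothetical and are not the decomposition used in the paper above.

```python
def conv_macs(c_in, c_out, k, h_out, w_out):
    """MACs for a standard k x k convolution producing an h_out x w_out map."""
    return c_in * c_out * k * k * h_out * w_out


def factored_macs(c_in, c_out, k, h_out, w_out, rank):
    """MACs for an assumed three-stage factorization (illustrative only):
    1x1 conv (c_in -> rank), k x k conv (rank -> rank), 1x1 conv (rank -> c_out)."""
    per_position = c_in * rank + rank * rank * k * k + rank * c_out
    return per_position * h_out * w_out


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, 56x56 output, rank 64.
full = conv_macs(256, 256, 3, 56, 56)
approx = factored_macs(256, 256, 3, 56, 56, rank=64)
speedup = full / approx  # theoretical ratio only; measured speed-ups differ
print(f"full={full} factored={approx} ratio={speedup:.2f}")
```

The ratio is a theoretical operation count; real speed-ups also depend on memory traffic and on how well the smaller layers map to the hardware.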
Share-and-Chat: Achieving Human-Level Video Commenting by Search and Multi-View Embedding
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2964320
Yehao Li, Ting Yao, Tao Mei, Hongyang Chao, Y. Rui
Abstract: Video has become a predominant social medium for booming live interactions. Automatic generation of emotional comments on a video has great potential to significantly increase user engagement in many socio-video applications (e.g., chat bots). Nevertheless, the problem of video commenting has been overlooked by the research community. The major challenge is that the generated comments must be not only as natural as those from human beings, but also relevant to the video content. We present in this paper a novel two-stage deep learning-based approach to automatic video commenting. Our approach consists of two components. The first component, similar video search, efficiently finds videos that are visually similar to a given video using approximate nearest-neighbor search based on learned deep video representations, while the second, dynamic ranking, effectively ranks the comments associated with the retrieved similar videos by learning a deep multi-view embedding space. To model the emotional view of videos, we incorporate visual sentiment, video content, and text comments into the learning of the embedding space. On a newly collected dataset with over 102K videos and 10.6M comments, we demonstrate that our approach outperforms several state-of-the-art methods and achieves human-level video commenting.
Citations: 16
Online Weighted Clustering for Real-time Abnormal Event Detection in Video Surveillance
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2967279
Hanhe Lin, Jeremiah D. Deng, B. Woodford, Ahmad Shahi
Abstract: Detecting abnormal events in video surveillance is a challenging problem due to the large-scale, streaming nature of the video data as well as the real-time constraint. In this paper, we present an online, adaptive, and real-time framework to address this problem. The spatial locations in a frame are partitioned into grids; in each grid, the proposed Adaptive Multi-scale Histogram Optical Flow (AMHOF) features are extracted and modelled by an Online Weighted Clustering (OWC) algorithm. AMHOFs that cannot be fitted to a cluster with large weight are regarded as abnormal events. The OWC algorithm is simple to implement and computationally efficient. In addition, we improve the detection performance with a Multiple Target Tracking (MTT) algorithm. Experimental results demonstrate that our approach outperforms state-of-the-art approaches in pixel-level rate of detection at a processing speed of 30 FPS.
Citations: 20
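The idea of flagging features that do not fit a heavily weighted cluster can be sketched roughly as follows. This is a minimal illustration under assumptions, not the OWC algorithm from the paper above: the class name, the fixed `radius` and `min_weight` thresholds, and the running-mean update are all hypothetical, and the adaptive behaviour mentioned in the abstract is omitted.

```python
import math


class OnlineWeightedClusters:
    """Toy online clustering: each cluster keeps a centroid and a weight (hit count)."""

    def __init__(self, radius, min_weight):
        self.radius = radius          # assumed max distance to join a cluster
        self.min_weight = min_weight  # assumed weight needed to count as "normal"
        self.centroids = []
        self.weights = []

    def update(self, x):
        """Absorb feature vector x and return True if it looks abnormal."""
        best, best_dist = None, math.inf
        for i, c in enumerate(self.centroids):
            d = math.dist(x, c)
            if d < best_dist:
                best, best_dist = i, d
        if best is None or best_dist > self.radius:
            # No existing cluster explains x: start a new, lightly weighted one.
            self.centroids.append(list(x))
            self.weights.append(1.0)
            return True
        w = self.weights[best]
        abnormal = w < self.min_weight  # judged on the weight before this update
        # Running-mean centroid update, then increase the cluster weight.
        self.centroids[best] = [(w * ci + xi) / (w + 1.0)
                                for ci, xi in zip(self.centroids[best], x)]
        self.weights[best] = w + 1.0
        return abnormal


model = OnlineWeightedClusters(radius=0.5, min_weight=5)
flags = [model.update((1.0 + 0.01 * i, 0.0)) for i in range(20)]
outlier_flag = model.update((5.0, 5.0))
print(flags[-1], outlier_flag, len(model.centroids))
```

In this toy run, the first few samples are flagged while their cluster is still light; once the cluster has accumulated enough weight, similar samples pass as normal and the distant sample opens a second cluster.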
DRIVING: Distributed Scheduling for Video Streaming in Vehicular Wi-Fi Systems
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2964290
X. Chen, Lei Rao, Qiao Xiang, Xue Liu, F. Bai
Abstract: Video streaming dominates mobile bandwidth and is still expanding drastically. Its tremendous economic benefits have driven the automobile industry to equip vehicles with video streaming capacity. As a result, new in-cabin Wi-Fi systems have been deployed, turning each vehicle into a streaming hotspot on wheels. A built-in Access Point (AP) bridges the communications between Wi-Fi devices inside and cellular networks outside. Distinct advantages offered by this system include a more powerful antenna array to improve multimedia quality, a constant energy source to power the streaming, etc. However, there are two challenging features that may jeopardize system performance. (1) The in-cabin Wi-Fi hotspots are mostly deployed on private vehicles, and thus are completely decentralized. (2) Video packets need to be delivered before their deadlines with small delays. Due to these features, existing algorithms may fail to efficiently schedule in-cabin Wi-Fi video streaming. To fill the gap, we propose the Delay-awaRe dIstributed Video schedulING (DRIVING) framework. Being fully distributed and delay-aware, DRIVING not only increases the streaming goodput, but also reduces the delivery latency and deadline missing ratio. In order to optimize this new framework, we establish cross-layer analytical models, which help us tune the framework parameters for better performance. In a typical scenario, DRIVING increases the goodput by up to 27.0%, while reducing the queueing delay and the deadline missing ratio by up to 40.0% and 38.4%, respectively.
Citations: 1
Video eCommerce: Towards Online Video Advertising
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2964326
Zhi-Qi Cheng, Yang Liu, Xiao Wu, Xiansheng Hua
Abstract: The prevalence of online videos provides an opportunity for e-commerce companies to exhibit their product ads in videos by recommendation. In this paper, we propose an advertising system named Video eCommerce to exhibit appropriate product ads to particular users at proper time stamps of videos. It takes into account video semantics, user shopping preference and viewing behavior feedback through a two-level strategy. At the first level, a novel Co-Relation Regression (CRR) model is proposed to construct the semantic association between keyframes and products. A heterogeneous information network (HIN) is adopted to build the user shopping preference from two different e-commerce platforms, Tmall and MagicBox, which alleviates the problems of data sparsity and cold start. In addition, the Video Scene Importance Model (VSIM) utilizes the viewing behavior of users to embed ads at the most attractive position within the video stream. At the second level, taking the results of CRR, HIN and VSIM as the input, Heterogeneous Relation Matrix Factorization (HRMF) is applied for product advertising. Extensive evaluation on a variety of online videos from Tmall MagicBox demonstrates that Video eCommerce achieves promising performance and significantly outperforms state-of-the-art advertising methods.
Citations: 32
The Lifecycle of Geotagged Multimedia Data
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2986911
R. Schifanella, B. Thomee
Abstract: The world is a big place. At any given instant something is happening somewhere, but even when nothing is going on people still find ways to generate multimedia data, ranging from social media posts to photos and videos. A substantial number of these media objects are associated with a location, and in an increasingly mobile and connected world (both in terms of people and devices), this number is only bound to get larger. Yet, in the multimedia literature we observe that many researchers often unwittingly treat the geospatial dimension as if it were a regular feature dimension, despite it requiring special attention. In order to avoid pitfalls and to steer clear of erroneous conclusions, this tutorial aims to teach researchers and students how geotagged multimedia data differs from regular data and to educate them on best practices when dealing with such data. We will cover the lifecycle of geotagged data in multimedia research, including how this kind of data is represented, processed, analyzed, and visualized. The tutorial requires both passive and active involvement: we not only present the material, but the attendees also get the opportunity to interact with it using a variety of open source data and tools that we have prepared using a virtual machine.
Citations: 1
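One concrete way the geospatial dimension differs from a regular feature dimension is that longitude wraps around at the antimeridian. The sketch below is an illustration only and is not taken from the tutorial material: it compares a naive Euclidean distance in degrees with the great-circle (haversine) distance, assuming a spherical Earth with a mean radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_degree_distance(lat1, lon1, lat2, lon2):
    """Euclidean distance on raw degrees, i.e. treating lat/lon as plain features."""
    return math.hypot(lat2 - lat1, lon2 - lon1)


# Two points on the equator, one degree apart across the antimeridian.
geo = haversine_km(0.0, 179.5, 0.0, -179.5)
naive = naive_degree_distance(0.0, 179.5, 0.0, -179.5)
print(f"haversine: {geo:.1f} km, naive: {naive:.1f} degrees")
```

The two points are roughly 111 km apart on the sphere, while the naive feature-space distance reports them as 359 degrees apart, which would place them at opposite ends of any clustering or nearest-neighbor result.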
A Deeply-Supervised Deconvolutional Network for Horizon Line Detection
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2967198
L. Porzi, S. R. Bulò, E. Ricci
Abstract: Automatic skyline detection from mountain pictures is an important task in many applications, such as web image retrieval, augmented reality and autonomous robot navigation. Recent works addressing the problem of Horizon Line Detection (HLD) demonstrated that learning-based boundary detection techniques are more accurate than traditional filtering methods. In this paper we introduce a novel approach for skyline detection, which adheres to a learning-based paradigm and exploits the representation power of deep architectures to improve horizon line detection accuracy. Unlike previous works, we explore a novel deconvolutional architecture, which introduces intermediate levels of supervision to support the learning process. Our experiments, conducted on a publicly available dataset, confirm that the proposed method outperforms previous learning-based HLD techniques by reducing the number of spurious edge pixels.
Citations: 13
MatchDR: Image Correspondence by Leveraging Distance Ratio Constraint
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2967293
Rui Wang, Dong Liang, Wei Zhang, Xiaochun Cao
Abstract: Image correspondence aims to establish connections between coherent images, which can be quite challenging due to visual and geometric deformations. This paper proposes a robust image correspondence technique from the perspective of spatial regularity. Specifically, the visual deformation is addressed by introducing spatial information through enforcing the distance ratio constraint. At the same time, the geometric deformation is tolerated by adopting a smoothness term. Subsequently, image correspondence is formulated as a permutation problem, for which we propose a Gradient Guided Simulated Annealing method for robust optimization. Furthermore, our method is much more memory efficient: the storage complexity is reduced from $O(n^4)$ to $O(n^2)$. The experiments on several datasets indicate that our proposed formulation and optimization significantly improve on the baselines for both visually-similar and semantically-similar images, where both visual and geometric deformations are present.
Citations: 1
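The intuition behind a distance ratio constraint can be sketched as follows: if a candidate permutation matches keypoints correctly up to a similarity transform, the ratio between corresponding pairwise distances is the same for every pair. The code below is a hypothetical illustration, not the MatchDR formulation or its optimizer: the cost (variance of log ratios), the function names, and the toy points are all assumptions, and the points are assumed distinct so no distance is zero. Only the two n x n distance matrices are stored in this sketch.

```python
import itertools
import math
import statistics


def pairwise(points):
    """n x n matrix of Euclidean distances between 2-D points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def ratio_cost(dist_a, dist_b, perm):
    """Variance of log distance ratios under perm (point i in A maps to perm[i] in B).
    A value near zero means all ratios agree; this cost is an assumed illustration."""
    logs = [math.log(dist_a[i][j] / dist_b[perm[i]][perm[j]])
            for i, j in itertools.combinations(range(len(perm)), 2)]
    return statistics.pvariance(logs)


# Toy data: B is A scaled by 2 and reordered.
points_a = [(0, 0), (1, 0), (0, 2), (3, 1)]
points_b = [(0, 4), (0, 0), (6, 2), (2, 0)]
dist_a, dist_b = pairwise(points_a), pairwise(points_b)

good = ratio_cost(dist_a, dist_b, [1, 3, 0, 2])  # the true correspondence
bad = ratio_cost(dist_a, dist_b, [0, 1, 2, 3])   # an incorrect correspondence
print(f"good={good:.2e} bad={bad:.3f}")
```

A search procedure such as simulated annealing could then explore permutations to minimize a cost of this kind; that search is left out here.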
Shorter-is-Better: Venue Category Estimation from Micro-Video
Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI: 10.1145/2964284.2964307
Jianglong Zhang, Liqiang Nie, Xiang Wang, Xiangnan He, Xianglin Huang, Tat-Seng Chua
Abstract: According to our statistics on over 2 million micro-videos, only 1.22% of them are associated with venue information, which greatly hinders location-oriented applications and personalized services. To alleviate this problem, we aim to label the bite-sized video clips with venue categories. It is, however, nontrivial for three reasons: 1) no available benchmark dataset; 2) insufficient information, low quality, and information loss; and 3) complex relatedness among venue categories. Towards this end, we propose a scheme comprising two components. In particular, we first crawl a representative set of micro-videos from Vine and extract a rich set of features from textual, visual and acoustic modalities. We then, in the second component, build a tree-guided multi-task multi-modal learning model to estimate the venue category for each unseen micro-video. This model is able to jointly learn a common space from multiple modalities and leverage the predefined Foursquare hierarchical structure to regularize the relatedness among venue categories. Extensive experiments have validated our model. As a side research contribution, we have released our data, code and the parameters involved.
Citations: 60