Proceedings of the 24th ACM international conference on Multimedia: Latest Publications

A Discriminative and Compact Audio Representation for Event Detection
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2970377
L. Jing, Bo Liu, Jaeyoung Choi, Adam L. Janin, Julia Bernd, Michael W. Mahoney, G. Friedland
Abstract: This paper presents a novel two-phase method for audio representation: Discriminative and Compact Audio Representation (DCAR). In the first phase, each audio track is modeled using a Gaussian mixture model (GMM) with several components that capture the variability within that track. The second phase takes both global and local structure into account: the components are made more discriminative and compact by formulating an optimization problem on Grassmannian manifolds, which we found to represent the structure of audio effectively. Experimental results on the YLI-MED dataset show that the proposed DCAR representation consistently outperforms state-of-the-art audio representations: i-vector, mv-vector, and GMM.
Citations: 6
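As a rough sketch of the first phase only, the snippet below (our illustration, not the authors' code) fits a per-track GMM over frame-level features; the MFCC-style feature choice, dimensionality, and component count are assumptions:

```python
# Hypothetical sketch of DCAR's first phase: a per-track GMM over
# frame-level audio features. Feature type and component count are
# illustrative assumptions, not the paper's exact settings.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_track_gmm(features: np.ndarray, n_components: int = 8) -> GaussianMixture:
    """Model one track's frame-level features (n_frames x n_dims) with
    a GMM whose components capture within-track variability."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
    gmm.fit(features)
    return gmm

# Example: 1000 frames of 20-dimensional features (e.g., MFCCs).
frames = np.random.default_rng(0).normal(size=(1000, 20))
gmm = fit_track_gmm(frames)
print(gmm.means_.shape)  # (8, 20): one mean vector per component
```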
SuperStreamer
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2973827
Yong Xue Eu, Jermyn Tanu, Justin Jieting Law, Muhammad Hanif B Ghazali, Shuan Siang Tay, Wei Tsang Ooi, A. Bhojan
Abstract: This technical demonstration presents the SuperStreamer project, which streams game assets progressively to players while a game is being played, reducing the startup time required to download and start playing a cloud-based game. SuperStreamer modifies a popular game engine, Unreal Engine 4, to support developing and playing games with progressive game asset streaming. With SuperStreamer, developers can mark the minimal set of files containing only the content essential to start playing the game; when a player starts the game, this minimal set is downloaded to the player's device. SuperStreamer also generates low-resolution textures automatically when a developer publishes a game, and these low-resolution textures are transmitted to the game client first. As players move through a game level, the high-quality textures the game requires are downloaded. In our demo game, we decrease the time taken to start up and load the first game level by around 30%.
Citations: 8
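The download ordering the abstract describes can be made concrete with a small, hypothetical scheduler: the developer-marked minimal set first, then low-resolution textures, then high-resolution textures ranked by how close the player is to where each is needed. All names and the distance heuristic are our assumptions, not SuperStreamer's actual implementation:

```python
# Minimal sketch of tiered progressive-asset scheduling in the spirit
# of SuperStreamer. Tiers: 0 = essential files, 1 = low-res textures,
# 2 = high-res textures ordered by player distance (assumed heuristic).
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Asset:
    priority: float
    name: str = field(compare=False)

def schedule(minimal: list[str], low_res: list[str],
             high_res: dict[str, float]) -> list[str]:
    """Return assets in download order; high_res maps asset name to
    the player's distance from where that texture is needed."""
    queue: list[Asset] = []
    for i, name in enumerate(minimal):
        heapq.heappush(queue, Asset(0 + i * 1e-3, name))  # tier 0
    for i, name in enumerate(low_res):
        heapq.heappush(queue, Asset(1 + i * 1e-3, name))  # tier 1
    for name, dist in high_res.items():
        heapq.heappush(queue, Asset(2 + dist, name))      # tier 2
    return [heapq.heappop(queue).name for _ in range(len(queue))]

order = schedule(["core.pak"], ["tex_low.pak"],
                 {"tex_hall": 5.0, "tex_lobby": 1.0})
print(order)  # ['core.pak', 'tex_low.pak', 'tex_lobby', 'tex_hall']
```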
High-speed Depth Stream Generation from a Hybrid Camera
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2964305
X. Zuo, Sen Wang, Jiangbin Zheng, Ruigang Yang
Abstract: High-speed video has become common in consumer-grade cameras, and augmenting these videos with a corresponding depth stream will enable new multimedia applications, such as 3D slow-motion video. In this paper, we present a hybrid camera system that combines a high-speed color camera with a depth sensor (e.g., the Kinect depth sensor) to generate a high-speed, high-resolution RGB+depth stream. Simply interpolating the low-speed depth frames is not satisfactory: interpolation artifacts and loss of surface detail are often visible. We have developed a novel framework that utilizes both shading constraints within each frame and optical flow constraints between neighboring frames. More specifically, we present (a) an effective method to find the intrinsic images, allowing more accurate normal estimation; and (b) an optimization-based framework to estimate the high-resolution, high-speed depth stream, taking into consideration temporal smoothness and shading/depth consistency. We evaluated our holistic framework on both synthetic and real sequences, and it showed superior performance to the previous state of the art.
Citations: 3
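The optimization the abstract outlines plausibly takes the form of an energy over the depth stream; the following is our hedged reconstruction of such an energy, not the paper's exact formulation:

```latex
\min_{\{D_t\}} \sum_t \Big[ E_{\mathrm{data}}(D_t)
  + \lambda_s\, E_{\mathrm{shading}}(D_t, I_t)
  + \lambda_f\, E_{\mathrm{flow}}(D_t, D_{t+1})
  + \lambda_t\, \lVert D_{t+1} - D_t \rVert^2 \Big]
```

Here D_t is the high-resolution depth at high-speed frame t, I_t is the color frame, E_shading penalizes disagreement between the normals of D_t and the shading component of the intrinsic-image decomposition of I_t, and E_flow ties depth changes to the optical flow between neighboring frames.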
Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2964314
Jingyuan Chen, Xuemeng Song, Liqiang Nie, Xiang Wang, Hanwang Zhang, Tat-Seng Chua
Abstract: Micro-videos, a new form of user-generated content (UGC), are attracting growing enthusiasm. Popular micro-videos have enormous commercial potential in many ways, such as online marketing and brand tracking. In fact, popularity prediction for traditional UGC, including tweets, web images, and long videos, has achieved good theoretical underpinnings and great practical success. However, little research has thus far been conducted on predicting the popularity of these bite-sized videos. This task is non-trivial for three reasons: 1) micro-videos are short in duration and of low quality; 2) they can be described by multiple heterogeneous channels, spanning social, visual, acoustic, and textual modalities; and 3) there are no available benchmark datasets or discriminant features suitable for this task. Towards this end, we present a transductive multi-modal learning model. The proposed model is designed to find an optimal latent common space that unifies and preserves information from the different modalities, whereby micro-videos can be better represented. This latent space can be used to alleviate the information insufficiency caused by the brief nature of micro-videos. In addition, we built a benchmark dataset and extracted a rich set of popularity-oriented features to characterize popular micro-videos. Extensive experiments have demonstrated the effectiveness of the proposed model. As a side contribution, we have released the dataset, code, and parameters to facilitate other researchers.
Citations: 131
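A generic transductive multi-modal objective of the kind the abstract outlines (our reconstruction under stated assumptions, not the paper's equation) might read:

```latex
\min_{H,\{W_m\}} \; \sum_{m=1}^{M} \alpha_m \lVert X_m W_m - H \rVert_F^2
  \;+\; \beta\, \mathrm{tr}\!\left(H^{\top} L H\right)
  \;+\; \gamma\, \lVert H w - y \rVert_2^2
```

with X_m the feature matrix of modality m (social, visual, acoustic, textual), H the shared latent representation, L a graph Laplacian over all micro-videos, labeled and unlabeled alike (the transductive ingredient), and y the observed popularity scores on the labeled portion.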
HEVC-compliant Tile-based Streaming of Panoramic Video for Virtual Reality Applications
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2967292
Alireza Zare, A. Aminlou, M. Hannuksela, M. Gabbouj
Abstract: Delivering wide-angle, high-resolution spherical panoramic video content entails a high streaming bitrate. This poses challenges when panorama clips are consumed on virtual reality (VR) head-mounted displays (HMDs), because HMDs typically require content of high spatial and temporal fidelity, delivered at strictly low latency, to guarantee the user's sense of presence. To alleviate the problem, we propose storing two versions of the same video content at different resolutions, each divided into multiple tiles using the High Efficiency Video Coding (HEVC) standard. According to the user's present viewport, a set of tiles is transmitted at the highest captured resolution, while the remaining parts are transmitted from the low-resolution version of the same content. To enable randomly choosing different combinations, the tile sets are encoded to be independently decodable. We further study the trade-offs in the choice of tiling scheme and its impact on compression and streaming bitrate performance. The results indicate streaming bitrate savings of 30% to 40%, depending on the selected tiling scheme, compared to streaming the entire video content.
Citations: 179
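To illustrate the viewport-driven selection step, here is a minimal sketch under simplifying assumptions (a rectangular tile grid in normalized panorama coordinates, no wraparound at the panorama seam); it is not the paper's implementation:

```python
# Hypothetical viewport-adaptive tile selection: tiles intersecting
# the current viewport come from the high-resolution version, all
# others from the low-resolution version.

def select_tiles(grid_w: int, grid_h: int,
                 view_x0: float, view_x1: float,
                 view_y0: float, view_y1: float):
    """Classify each tile of a grid_w x grid_h panorama grid.
    The viewport is given in normalized [0, 1] panorama coordinates."""
    high, low = [], []
    for ty in range(grid_h):
        for tx in range(grid_w):
            x0, x1 = tx / grid_w, (tx + 1) / grid_w
            y0, y1 = ty / grid_h, (ty + 1) / grid_h
            overlaps = (x0 < view_x1 and x1 > view_x0 and
                        y0 < view_y1 and y1 > view_y0)
            (high if overlaps else low).append((tx, ty))
    return high, low

high, low = select_tiles(8, 4, 0.30, 0.55, 0.25, 0.75)
print(len(high), "high-res tiles,", len(low), "low-res tiles")
```

Only the tiles in `high` need the full captured resolution, which is where the reported 30% to 40% bitrate saving comes from.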
WorkCache: Salvaging siloed knowledge
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2973809
S. Carter, Laurent Denoue, Matthew L. Cooper
Abstract: The proliferation of workplace multimedia collaboration applications has meant, on the one hand, more opportunities for group work, but on the other, more data locked away in proprietary interfaces. We are developing new tools to capture and access multimedia content from any source. In this demo, we focus primarily on new methods that allow users to rapidly reconstitute, enhance, and share document-based information.
Citations: 0
MP3DG-PCC, Open Source Software Framework for Implementation and Evaluation of Point Cloud Compression
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2973806
R. Mekuria, Pablo César
Abstract: We present MP3DG-PCC, an open source framework for the design, implementation, and evaluation of point cloud compression algorithms. The framework includes objective quality metrics, lossy and lossless anchor codecs, and a test bench for consistent comparative evaluation. The framework and the proposed methodology are in use for the development of an international point cloud compression standard in MPEG. In addition, the library is integrated with the popular Point Cloud Library (PCL), making a large number of point cloud processing routines available and aligning the work with the broader open source community.
Citations: 14
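MP3DG-PCC itself is a C++ framework and we do not reproduce its API here; as a generic, self-contained illustration of the lossy quantization idea at the heart of octree-style anchor codecs, consider this sketch (entirely our own, under the assumption of uniform grid quantization):

```python
# Generic illustration of lossy point cloud quantization: snap points
# to a 2^bits grid over their bounding box and merge duplicates.
# This is NOT the MP3DG-PCC API, just the underlying idea.
import numpy as np

def quantize_points(points: np.ndarray, bits: int = 10):
    """Quantize an N x 3 point array; returns dequantized points and
    the surviving point count (duplicates collapse into one)."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    scale = (2 ** bits - 1) / np.maximum(hi - lo, 1e-12)
    grid = np.round((points - lo) * scale).astype(np.int64)
    unique = np.unique(grid, axis=0)
    return unique / scale + lo, len(unique)

pts = np.random.default_rng(1).uniform(size=(10000, 3))
deq, n = quantize_points(pts, bits=6)
print(f"{len(pts)} points -> {n} after 6-bit quantization")
```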
Key Color Generation for Affective Multimedia Production: An Initial Method and Its Application
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2964323
Eunjin Kim, Hyeon‐Jeong Suk
Abstract: In this paper, we introduce a method that generates a key color to form an aesthetic and affective harmony with visual content. Given an image and an affective term, our method creates a key color by combining a dominant hue of the image with a unique tone associated with the affective word. To match each affective term with a specific tone, we collected color themes from a crowd-sourced database and identified the most popular tone among the color themes relevant to each affective term. The results of a user test showed that the method generates key colors as satisfactory as those chosen by designers. Finally, as a prospective application, we applied our method in a promotional video editing prototype: it automatically generates a key color based on a frame of an input video and applies the color to a shape that delivers a promotional message. A second user study verified that the video editing prototype with our method can effectively deliver the desired affective state with satisfactory quality.
Citations: 7
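The two ingredients the abstract names, a dominant hue from the image and a tone tied to an affective term, can be sketched as follows. The tone table is an invented placeholder, not the paper's crowd-sourced data, and the modal-hue histogram is our assumption about how dominance is measured:

```python
# Hypothetical key-color sketch: dominant hue of the image combined
# with an affective tone, here modeled as a (saturation, value) pair.
import colorsys
import numpy as np

AFFECTIVE_TONES = {"calm": (0.35, 0.90), "dynamic": (0.90, 0.95)}  # placeholder (S, V)

def dominant_hue(rgb_image: np.ndarray, bins: int = 36) -> float:
    """Return the modal hue in [0, 1) of an H x W x 3 float RGB image."""
    flat = rgb_image.reshape(-1, 3)
    hues = np.array([colorsys.rgb_to_hsv(*px)[0] for px in flat])
    hist, edges = np.histogram(hues, bins=bins, range=(0.0, 1.0))
    return float(edges[np.argmax(hist)] + 0.5 / bins)  # bin center

def key_color(rgb_image: np.ndarray, affect: str):
    h = dominant_hue(rgb_image)
    s, v = AFFECTIVE_TONES[affect]
    return colorsys.hsv_to_rgb(h, s, v)

img = np.random.default_rng(2).uniform(size=(32, 32, 3))
print(key_color(img, "calm"))  # an RGB triple in [0, 1]
```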
Deep Correlation Features for Image Style Classification
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2967251
W. Chu, Yi-Ling Wu
Abstract: This paper presents a comprehensive study of deep correlation features for image style classification. Inspired by the observation that correlations between feature maps can effectively describe image texture, we design and transform various such correlations into style vectors and investigate the classification performance brought by different variants. In addition to intra-layer correlation, we also propose inter-layer correlation and verify its benefit. Through extensive experiments on image style classification and artist classification, we demonstrate that the proposed style vectors significantly outperform CNN features from fully-connected layers, as well as the state-of-the-art deep representations.
Citations: 43
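The general construction behind such style vectors is the Gram (correlation) matrix of feature maps. Below is a minimal sketch of intra- and inter-layer correlations; the layer shapes, the strided subsampling used to align spatial sizes, and the random arrays standing in for real CNN activations are all our assumptions:

```python
# Sketch of intra-layer and inter-layer correlation features over
# C x H x W feature maps (random arrays stand in for CNN outputs).
import numpy as np

def gram(fmap: np.ndarray) -> np.ndarray:
    """Intra-layer correlation: a C x C matrix of inner products
    between the channel responses of one feature map."""
    c = fmap.shape[0]
    flat = fmap.reshape(c, -1)
    return flat @ flat.T / flat.shape[1]

def cross_gram(fa: np.ndarray, fb: np.ndarray) -> np.ndarray:
    """Inter-layer correlation between two maps; the larger map is
    subsampled so spatial sizes match (assumes fb is the smaller)."""
    ha, wa = fa.shape[1:]
    hb, wb = fb.shape[1:]
    fa_small = fa[:, :: ha // hb, :: wa // wb][:, :hb, :wb]
    fa_flat = fa_small.reshape(fa.shape[0], -1)
    fb_flat = fb.reshape(fb.shape[0], -1)
    return fa_flat @ fb_flat.T / (hb * wb)

rng = np.random.default_rng(3)
conv3 = rng.normal(size=(256, 28, 28))   # stand-in for a mid layer
conv4 = rng.normal(size=(512, 14, 14))   # stand-in for a deeper layer
style_vec = np.concatenate([gram(conv3).ravel(),
                            cross_gram(conv3, conv4).ravel()])
print(style_vec.shape)  # (256*256 + 256*512,)
```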
A Multi-Video Browser for Endoscopic Videos on Tablets
Proceedings of the 24th ACM international conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2973821
Marco A. Hudelist, Sabrina Kletz, Klaus Schöffmann
Abstract: We present a browser for endoscopic videos that is designed to make it easy to navigate and compare scenes on a tablet. It utilizes frame stripes at different levels of detail to quickly switch between fast and detailed navigation. Moreover, it uses saliency methods to determine which areas of a given keyframe contain the most information, further improving the visualization of the frame stripes. Because scenes with much movement are often irrelevant out-of-patient scenes, the tool supports filtering for scenes of low, medium, and high motion. The tool can be especially useful for patient debriefings as well as for educational purposes.
Citations: 2
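The motion filter the abstract mentions could be as simple as thresholding mean frame differences; this sketch is our illustration under that assumption, with invented thresholds, not the tool's actual classifier:

```python
# Minimal sketch of a low/medium/high motion classifier based on mean
# absolute frame differences. Thresholds are illustrative assumptions.
import numpy as np

def motion_level(frames: np.ndarray,
                 t_low: float = 2.0, t_high: float = 10.0) -> str:
    """frames: T x H x W grayscale array with values in [0, 255]."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    score = float(diffs.mean())
    if score < t_low:
        return "low"
    return "medium" if score < t_high else "high"

clip = np.random.default_rng(4).uniform(0, 255, size=(30, 120, 160))
print(motion_level(clip))  # random noise varies wildly -> "high"
```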