Proceedings of the 21st ACM international conference on Multimedia: Latest Publications

Online multimodal deep similarity learning with application to image retrieval
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502112
Pengcheng Wu, S. Hoi, Hao Xia, P. Zhao, Dayong Wang, C. Miao
Abstract: Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically learn a linear distance function on the input feature space, and this linearity assumption limits their capacity to measure similarity between complex patterns in real-world applications; (ii) they are often designed for learning distance metrics on uni-modal data, and so may not effectively handle similarity measures for multimedia objects with multimodal representations. To address these limitations, this paper proposes a novel framework of online multimodal deep similarity learning (OMDSL), which aims to optimally integrate multiple deep neural networks pretrained with stacked denoising autoencoders. In particular, the proposed framework explores a unified two-stage online learning scheme that consists of (i) learning a flexible nonlinear transformation function for each individual modality, and (ii) learning the optimal combination of multiple diverse modalities simultaneously in a coherent process. An extensive set of experiments on multimodal image retrieval tasks validates the effectiveness of the proposed technique.
Citations: 169
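The abstract above describes a two-stage scheme: a pretrained nonlinear embedding per modality, followed by online learning of how the modalities are combined. The Python sketch below is only a toy illustration of that idea under stated assumptions: the encoders are random single-layer maps standing in for pretrained stacked-denoising-autoencoder encoders, and the combination weights are updated with a simple hinge-style triplet rule. Class and function names are hypothetical, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W, b):
    # Stand-in for a pretrained stacked-denoising-autoencoder encoder:
    # one nonlinear layer mapping a raw modality feature into an embedding.
    return np.tanh(W @ x + b)

class OnlineMultimodalSimilarity:
    """Toy sketch of the two-stage scheme described in the abstract:
    fixed nonlinear per-modality embeddings plus modality combination
    weights updated online from relative-similarity (triplet) feedback."""

    def __init__(self, dims, embed_dim=32, lr=0.1):
        self.enc = [(rng.normal(size=(embed_dim, d)) / np.sqrt(d),
                     np.zeros(embed_dim)) for d in dims]
        self.theta = np.ones(len(dims)) / len(dims)  # modality weights
        self.lr = lr

    def _per_modality_sim(self, xs, ys):
        sims = []
        for (W, b), x, y in zip(self.enc, xs, ys):
            u, v = encode(x, W, b), encode(y, W, b)
            sims.append(float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))
        return np.array(sims)

    def similarity(self, xs, ys):
        return float(self.theta @ self._per_modality_sim(xs, ys))

    def update(self, query, positive, negative, margin=0.1):
        # Hinge-style online update: push the weighted similarity of the
        # (query, positive) pair above that of (query, negative) by a margin.
        s_pos = self._per_modality_sim(query, positive)
        s_neg = self._per_modality_sim(query, negative)
        if self.theta @ (s_pos - s_neg) < margin:
            self.theta += self.lr * (s_pos - s_neg)
            self.theta = np.clip(self.theta, 0.0, None)
            self.theta /= self.theta.sum() + 1e-9  # keep weights on the simplex

# Usage with two fake modalities (e.g., a color feature and a texture feature)
model = OnlineMultimodalSimilarity(dims=[64, 128])
q = [rng.normal(size=64), rng.normal(size=128)]
p = [v + 0.1 * rng.normal(size=v.shape) for v in q]   # near-duplicate image
n = [rng.normal(size=64), rng.normal(size=128)]        # unrelated image
model.update(q, p, n)
print(model.similarity(q, p) > model.similarity(q, n))
```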
Stereotime: a wireless 2D and 3D switchable video communication system
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502275
You Yang, Qiong Liu, Yue Gao, Binbin Xiong, Li Yu, Huanbo Luan, R. Ji, Q. Tian
Abstract: Mobile 3D video communication, especially with 2D/3D compatibility, is a new paradigm for both video communication and 3D video processing. Current techniques face challenges on mobile devices when bundled constraints such as limited computation resources and compatibility must be considered. In this work, we present Stereotime, a wireless 2D/3D switchable video communication system that addresses these challenges. Zig-Zag fast object segmentation, depth-cue detection and merging, and texture-adaptive view generation are used for 3D scene reconstruction. We demonstrate its functionality and compatibility on 3D mobile devices in a WiFi network environment.
Citations: 5
Jiku director: a mobile video mashup system
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502277
Duong-Trung-Dung Nguyen, M. Saini, Vu-Thanh Nguyen, Wei Tsang Ooi
Abstract: In this technical demonstration, we present a Web-based application called Jiku Director that automatically creates a mashup video from event videos uploaded by users. The system runs an algorithm that considers view quality (shakiness, tilt, occlusion), video quality (blockiness, contrast, sharpness, illumination, burned pixels), and spatial-temporal diversity (shot angles, shot lengths) to create a mashup video with smooth shot transitions while covering the event from different perspectives.
Citations: 8
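The abstract lists the factors the mashup algorithm weighs: view quality, video quality, and spatial-temporal diversity. The sketch below is a hypothetical, heavily simplified scoring-and-selection loop showing how such factors can be combined per time slot; the fields, weights, and greedy selection rule are illustrative assumptions, not the authors' actual algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    camera: str          # which uploaded stream the segment comes from
    shakiness: float     # 0 (stable) .. 1 (very shaky)
    tilt: float          # 0 .. 1
    occlusion: float     # 0 .. 1
    sharpness: float     # 0 .. 1, higher is better
    contrast: float      # 0 .. 1, higher is better

def quality(seg: Segment) -> float:
    # View quality penalizes shakiness/tilt/occlusion; video quality rewards
    # sharpness and contrast. The weights are illustrative, not from the paper.
    view = 1.0 - (0.4 * seg.shakiness + 0.3 * seg.tilt + 0.3 * seg.occlusion)
    video = 0.5 * seg.sharpness + 0.5 * seg.contrast
    return 0.6 * view + 0.4 * video

def pick_next(candidates: List[Segment], prev_camera: Optional[str],
              diversity_bonus: float = 0.15) -> Segment:
    # Spatial-temporal diversity: slightly prefer switching to a different
    # camera so the mashup covers the event from multiple perspectives.
    def score(seg: Segment) -> float:
        bonus = diversity_bonus if seg.camera != prev_camera else 0.0
        return quality(seg) + bonus
    return max(candidates, key=score)

# Choose one segment per time slot from each slot's candidate segments.
slots = [
    [Segment("A", 0.1, 0.0, 0.2, 0.8, 0.7), Segment("B", 0.3, 0.1, 0.0, 0.9, 0.8)],
    [Segment("A", 0.2, 0.1, 0.1, 0.7, 0.6), Segment("B", 0.6, 0.2, 0.3, 0.5, 0.5)],
]
prev = None
for candidates in slots:
    chosen = pick_next(candidates, prev)
    prev = chosen.camera
    print("slot ->", chosen.camera)
```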
Action recognition using invariant features under unexampled viewing conditions
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2508126
Litian Sun, K. Aizawa
Abstract: A great challenge in real-world applications of action recognition is the lack of sufficient label information, owing to variance in the recording viewpoint and differences between individuals. A system that can adapt itself to these variances is required for practical use. We present a generic method for extracting view-invariant features from skeleton joints. These view-invariant features are further refined using a stacked, compact autoencoder. To model the challenge of real-world applications, two unexampled test settings (NewView and NewPerson) are used to evaluate the proposed method. Experimental results with these test settings demonstrate the effectiveness of our method.
Citations: 14
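The paper extracts view-invariant features from skeleton joints. As an illustration of view invariance only (the abstract does not specify the feature, so the choice here is an assumption), the sketch below uses pairwise joint distances, which are unchanged under camera rotation and translation.

```python
import numpy as np

def pairwise_joint_distances(skeleton: np.ndarray) -> np.ndarray:
    """One common choice of view-invariant skeleton feature (an assumption
    here, not necessarily the paper's): all pairwise Euclidean distances
    between joints, which are unchanged by camera rotation and translation.

    skeleton: (J, 3) array of 3D joint positions for one frame."""
    diff = skeleton[:, None, :] - skeleton[None, :, :]      # (J, J, 3)
    dist = np.linalg.norm(diff, axis=-1)                    # (J, J)
    iu = np.triu_indices(len(skeleton), k=1)
    return dist[iu]                                          # (J*(J-1)/2,)

# A rotated and translated copy of the same pose yields the same feature.
rng = np.random.default_rng(1)
pose = rng.normal(size=(20, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
moved = pose @ R.T + np.array([1.0, -2.0, 0.5])
print(np.allclose(pairwise_joint_distances(pose),
                  pairwise_joint_distances(moved)))          # True
```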
SentiBank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502268
Damian Borth, Tao Chen, R. Ji, Shih-Fu Chang
Abstract: A picture is worth a thousand words, but what words should be used to describe the sentiment and emotions conveyed in increasingly popular social multimedia? We demonstrate a novel system that combines sound structures from psychology with the folksonomy extracted from social multimedia to develop a large visual sentiment ontology, SentiBank, consisting of 1,200 concepts and associated classifiers. Each concept, defined as an Adjective Noun Pair (ANP), is made of an adjective strongly indicating emotion and a noun corresponding to objects or scenes that have a reasonable prospect of automatic detection. We believe such large-scale visual classifiers offer a powerful mid-level semantic representation enabling high-level sentiment analysis of social multimedia. We demonstrate novel applications made possible by SentiBank, including live sentiment prediction of social media and visualization of visual content in a rich, intuitive semantic space.
Citations: 190
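SentiBank's key idea is a bank of Adjective Noun Pair (ANP) detectors used as a mid-level representation for sentiment analysis. The following sketch shows that pipeline shape with a tiny hypothetical bank of four ANPs backed by random linear classifiers; the concept names, weights, and the polarity-weighted aggregation are illustrative assumptions, not SentiBank's released models or API.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical mini-bank: a few Adjective-Noun-Pair concepts, each with a
# linear classifier and a prior sentiment polarity in [-1, 1]. SentiBank
# itself ships 1,200 such concepts; names and weights here are made up.
FEATURE_DIM = 128
ANP_BANK = {
    "beautiful sky":  (rng.normal(size=FEATURE_DIM),  0.9),
    "cute dog":       (rng.normal(size=FEATURE_DIM),  0.8),
    "broken window":  (rng.normal(size=FEATURE_DIM), -0.6),
    "scary night":    (rng.normal(size=FEATURE_DIM), -0.8),
}

def anp_responses(image_feature: np.ndarray) -> dict:
    # Mid-level representation: one detector score per ANP concept.
    return {anp: float(1.0 / (1.0 + np.exp(-(w @ image_feature))))
            for anp, (w, _) in ANP_BANK.items()}

def sentiment(image_feature: np.ndarray) -> float:
    # High-level sentiment: polarity-weighted average of concept responses.
    resp = anp_responses(image_feature)
    num = sum(resp[a] * pol for a, (_, pol) in ANP_BANK.items())
    den = sum(resp.values()) + 1e-9
    return num / den

x = rng.normal(size=FEATURE_DIM)     # stand-in for a visual feature vector
print(anp_responses(x))
print("sentiment:", round(sentiment(x), 3))
```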
Multimedia information retrieval: music and audio
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502237
M. Schedl, E. Gómez, Masataka Goto
Abstract: Music is an omnipresent topic in our daily lives, as almost everyone enjoys listening to his or her favorite tunes. Music information retrieval (MIR) is a research field that aims, among other things, at automatically extracting semantically meaningful information from various representations of music entities, such as a digital audio file, a band's web page, a song's lyrics, or a tweet about a microblogger's current listening activity. A key approach in MIR is to describe music via computational features, which can be categorized into music content, music context, and user context. Music content refers to features extracted from the audio signal, while information about musical entities not encoded in the signal (e.g., an image of an artist or the political background of a song) is referred to as music context. The user context, in contrast, includes environmental aspects as well as the physical and mental activities of the music listener. MIR research has seen a paradigm shift over the last couple of years, as an increasing number of recent approaches and commercial technologies combine content-based techniques (focusing on the audio signal) with multimedia context data mined, e.g., from web sources, and with user context information.
Citations: 6
Moment feature based forensic detection of resampled digital images
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502150
Lu Li, Jianru Xue, Zhiqiang Tian, Nanning Zheng
Abstract: Forensic detection of resampled digital images has become an important technology, among many others, for establishing the integrity of digital visual content. This paper proposes a moment-feature-based method to detect resampled digital images. Rather than concentrating on the positions of the characteristic resampling peaks, we use a moment feature to exploit the periodic interpolation characteristics in the frequency domain, taking into account not only the positions of the resampling peaks but also the amplitude distribution. With the extracted moment feature, a trained SVM classifier is used to detect resampled digital images. Extensive experimental results show the validity and efficiency of the proposed method.
Citations: 17
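The method detects resampling from periodic interpolation traces in the frequency domain, summarized by moment features and classified with an SVM. The sketch below follows that general recipe under stated assumptions: a simple neighbor-average predictor for the residual, radial spectral moments as the feature, and naive pixel repetition as the "resampled" class. The exact predictor and moment definitions used in the paper may differ.

```python
import numpy as np
from sklearn.svm import SVC

def residual_spectrum(img: np.ndarray) -> np.ndarray:
    # A linear-predictor residual exposes the periodic correlations that
    # interpolation introduces; its 2D spectrum shows characteristic peaks.
    pred = 0.25 * (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
                   np.roll(img, 1, 1) + np.roll(img, -1, 1))
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img - pred)))
    return spec / (spec.sum() + 1e-9)          # normalize to a distribution

def moment_feature(img: np.ndarray, max_order: int = 3) -> np.ndarray:
    # Low-order spectral moments summarize both peak positions and the
    # amplitude distribution (the paper's exact moments may differ).
    p = residual_spectrum(img)
    h, w = p.shape
    fy, fx = np.meshgrid(np.linspace(-0.5, 0.5, h),
                         np.linspace(-0.5, 0.5, w), indexing="ij")
    r = np.sqrt(fx ** 2 + fy ** 2)
    return np.array([(p * r ** k).sum() for k in range(1, max_order + 1)])

def make_image(rng, resampled: bool) -> np.ndarray:
    img = rng.normal(size=(64, 64))
    if resampled:
        # Crude 2x upsample by pixel repetition, then crop: a resampling trace.
        img = np.repeat(np.repeat(img, 2, 0), 2, 1)[:64, :64]
    return img

rng = np.random.default_rng(3)
X, y = [], []
for label in (0, 1):
    for _ in range(40):
        X.append(moment_feature(make_image(rng, bool(label))))
        y.append(label)
clf = SVC(kernel="rbf").fit(X, y)
test = moment_feature(make_image(rng, True))
print("predicted resampled:", bool(clf.predict([test])[0]))
```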
Session details: Scene understanding
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/3245301
D. Joshi
Citations: 0
Session details: Security and forensics
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/3245296
R. Cucchiara
Citations: 0
Golden retriever: a Java based open source image retrieval engine
Proceedings of the 21st ACM international conference on Multimedia · Pub Date: 2013-10-21 · DOI: 10.1145/2502081.2502227
Lazaros Tsochatzidis, C. Iakovidou, S. Chatzichristofis, Y. Boutalis
Abstract: The Golden Retriever Image Retrieval Engine (GRire) is an open source, lightweight Java library developed for Content Based Image Retrieval (CBIR) tasks, employing the Bag of Visual Words (BOVW) model. It provides a complete framework for creating CBIR systems, including image analysis tools, classifiers, weighting schemes, etc., for efficient indexing and retrieval. Its distinguishing feature is its extensibility, achieved through the open source nature of the library as well as a user-friendly embedded plug-in system. GRire is available online, along with installation and development documentation, at http://www.grire.net and on its Google Code page http://code.google.com/p/grire. It is distributed either as a Java library or as a standalone Java application, both GPL licensed.
Citations: 4
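GRire itself is a Java library, and its actual API is not shown in the abstract. The Python sketch below only illustrates the Bag-of-Visual-Words indexing and tf-idf weighting that such an engine typically implements, with randomly generated descriptors and a random codebook standing in for real local features; it is not GRire code.

```python
import numpy as np

def bovw_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # Assign each local descriptor to its nearest visual word and count.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook)).astype(float)

def tfidf_index(histograms: list) -> np.ndarray:
    # Classic tf-idf weighting over visual words, one of the weighting
    # schemes a BOVW retrieval engine typically offers.
    H = np.array(histograms)
    tf = H / (H.sum(axis=1, keepdims=True) + 1e-9)
    df = (H > 0).sum(axis=0)
    idf = np.log(len(H) / (df + 1e-9))
    W = tf * idf
    return W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-9)

rng = np.random.default_rng(4)
codebook = rng.normal(size=(50, 16))                 # 50 visual words
db = [rng.normal(size=(rng.integers(80, 120), 16)) for _ in range(5)]
index = tfidf_index([bovw_histogram(d, codebook) for d in db])

# Query with a slightly perturbed copy of image 2; it should rank first.
query = bovw_histogram(db[2] + 0.01 * rng.normal(size=db[2].shape), codebook)
query = query / (np.linalg.norm(query) + 1e-9)
scores = index @ query
print("best match:", int(scores.argmax()))           # expected: 2
```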