Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval: Latest Publications

Family Photo Recognition via Multiple Instance Learning
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079036
Junkang Zhang, Siyu Xia, Ming Shao, Y. Fu
Abstract: Family photo recognition is an important task in social media analytics. Previous methods use singleton global features and conventional binary classifiers to distinguish family group photos from non-family ones. In contrast, we propose a novel family recognition approach with three dedicated local representations under a Multiple Instance Learning framework, in which geometry, kinship, and semantic features are integrated to overcome issues in the previous work. Experimental results show that our method achieves the state-of-the-art result among global-feature models.
Citations: 6
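The multiple-instance view above treats a photo as a bag of local instances (e.g. faces or face pairs) and labels the bag from its strongest instance. A minimal sketch of that max-pooling MIL decision rule; the `bag_predict` helper, the threshold, and the toy scores are illustrative, not the paper's actual model:

```python
# Hypothetical instance-level scores: each photo (bag) yields one score per
# detected local region (e.g. a face or face pair). Names are illustrative.
def bag_predict(instance_scores, threshold=0.5):
    """Label a bag positive if its strongest instance exceeds the threshold
    (the standard max-pooling assumption in Multiple Instance Learning)."""
    return max(instance_scores) >= threshold

# A photo with one confident local cue is labelled a family photo;
# a photo with only weak cues is not.
family = bag_predict([0.1, 0.8, 0.3])
non_family = bag_predict([0.2, 0.1, 0.4])
```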
Visually Browsing Millions of Images Using Image Graphs
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079016
K. U. Barthel, N. Hezel, K. Jung
Abstract: We present a new approach to visually browse very large sets of untagged images. High-quality image features are generated using transformed activations of a convolutional neural network. These features are used to model image similarities, from which a hierarchical image graph is built. We show how such a graph can be constructed efficiently. In our experiments, we found that the best user experience for navigating the graph is achieved by projecting sub-graphs onto a regular 2D image map. This allows users to explore the image collection like an interactive map.
Citations: 11
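The core of the browsing approach is a similarity graph over CNN features. A minimal sketch of building a k-nearest-neighbour image graph from feature vectors by cosine similarity; `build_knn_graph` and the toy features are assumptions for illustration, and the paper's hierarchical construction and 2D map projection are not reproduced here:

```python
import numpy as np

def build_knn_graph(features, k=2):
    """Connect each image to its k most similar neighbours under cosine
    similarity; returns an adjacency list {index: [neighbour indices]}.
    Sketch only -- the paper builds a *hierarchical* graph on top of this."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-similarity
    return {i: list(np.argsort(-sims[i])[:k]) for i in range(len(features))}

# Four toy "CNN feature" vectors; 0/1 and 2/3 are near-duplicates.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
graph = build_knn_graph(feats, k=1)
```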
Deep Sentiment Features of Context and Faces for Affective Video Analysis
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079027
C. Baecchi, Tiberio Uricchio, M. Bertini, A. Bimbo
Abstract: Given the huge quantity of hours of video available on video-sharing platforms such as YouTube and Vimeo, the development of automatic tools that help users find videos that fit their interests has attracted the attention of both the scientific and industrial communities. So far, the majority of works have addressed semantic analysis, to identify objects, scenes, and events depicted in videos, but more recently affective analysis of videos has started to gain more attention. In this work we investigate the use of sentiment-driven features to classify the induced sentiment of a video, i.e. the sentiment reaction of the user. Instead of using standard computer vision features, such as CNN or SIFT features trained to recognize objects and scenes, we exploit sentiment-related features such as those provided by Deep-SentiBank, and features extracted from models that exploit deep networks trained on face expressions. We experiment on two recently introduced datasets, LIRIS-ACCEDE and MEDIAEVAL-2015, which provide sentiment annotations of a large set of short videos. We show that our approach not only outperforms the current state of the art in terms of valence and arousal classification accuracy, but also uses a smaller number of features, thus requiring less video processing.
Citations: 15
Finger Vein Image Retrieval via Coding Scale-varied Superpixel Feature
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3078975
Kuikui Wang, Lu Yang, Gongping Yang, Xin Luo, Kun Su, Yilong Yin
Abstract: Finger vein image retrieval is a significant technique for fast identification, especially in large-scale applications. However, most existing retrieval methods are based on fixed-scale features of non-overlapping rectangular image blocks, in which both the representation ability of the feature and the local consistency of the vein pattern are overlooked. Weak encoding (e.g., binarization with a predefined threshold) also limits retrieval performance. Focusing on these problems, this paper proposes a novel finger vein image retrieval framework based on similarity-preserving encoding of a scale-varied superpixel feature. In the framework, locally consistent pixels in one superpixel are used as a unit of feature representation, and the feature length varies with the category of the superpixel, classified by the variance of the lowest-dimensional feature. Additionally, the encoding, based on feature compaction and feature rotation, minimizes the quantization loss and preserves the similarity between the scale-varied feature and the encoded binary codes. Experimental results on six public finger vein databases demonstrate the superiority of the proposed retrieval approach over the state of the art.
Citations: 6
Intelligently Connecting People with Information
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3081371
Changhu Wang
Abstract: How to effectively connect people with information is a fundamental problem in human society. We are now in the era of mobile first, and everything is digitally connected. With the advent of diverse social contents, information feeds have become a new way to connect people with information, and there is a good opportunity for artificial intelligence (AI) to innovate in this direction. AI can make the creation, moderation, dissemination, searching, consumption, and interaction of information and contents more efficient and intelligent. As an industry leader in the product platform and service of information feeds, Toutiao takes the lead in developing and leveraging diverse machine learning techniques to efficiently process, analyze, mine, understand, and organize a large amount of multimedia data. Meanwhile, owing to its rich application scenarios and active users all over the world, we have accumulated a huge amount of training data, which lets the machine learning system form a closed feedback loop and thus continually improve and evolve itself. This closed-loop system enables Toutiao to develop core AI technologies in large-scale machine learning, text analysis, natural language processing, computer vision, and data mining. In this talk, I will share some personal opinions on the development prospects of AI in this fundamental area, including my understanding of AI, important research progress in recent years, the influence of AI on the software industry, and how to build a company's core AI competence strategy. Moreover, I will also introduce some research progress of the Toutiao AI Lab.
Citations: 0
An Unsupervised Distance Learning Framework for Multimedia Retrieval
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079017
Lucas Pascotti Valem, D. C. G. Pedronette
Abstract: Due to the increasing availability of image and multimedia collections, unsupervised post-processing methods, which can improve the effectiveness of retrieval results without user intervention, have become indispensable. This paper presents the Unsupervised Distance Learning Framework (UDLF), a software framework that enables easy use and evaluation of unsupervised learning methods. The framework defines a broad model, allowing the implementation of different unsupervised methods and supporting diverse file formats for input and output. Seven unsupervised methods are initially available in the framework. Executions and experiments can be easily defined by setting a configuration file. The framework also supports evaluation of the retrieval results, exporting visual output and computing effectiveness and efficiency measures. The source code is publicly available, so that anyone can freely access, use, change, and share the software under the terms of the GPLv2 license.
Citations: 10
Information Retrieval from Multi-Sensor Data for Enriching Location Services at HERE Technologies
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3081370
Matei Stroila
Abstract: HERE Technologies provides real-time location services that enable people, enterprises, and cities around the world to harness the power of location and create innovative solutions for safer and more efficient living. Multimedia retrieval techniques and sensor fusion approaches are essential for enriching location services and for keeping the underlying map up to date. In this talk, I will give an overview of some of the work we do in the CTO Research group to support existing location services and enable future ones. We aim to automatically extract useful information from massive collections of images, LiDAR point clouds, car sensor data, and open web data. I will present work related to image recognition for map-making purposes, information retrieval for points-of-interest enrichment, and the creation of a highly accurate map of roads and cities for future autonomous navigation services.
Citations: 0
Session details: Tutorials
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3254614
G. Awad
Citations: 0
Improving Image Classification using Coarse and Fine Labels
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079042
Anuvabh Dutt, D. Pellerin, G. Quénot
Abstract: The performance of classifiers is generally improved by designing models with a large number of parameters or by using ensembles. We tackle the problem of classifying coarse- and fine-grained categories that share a semantic relationship. Given a classifier's predictions for a test sample, we adjust the probabilities according to the semantics of the categories on which the classifier was trained. We present an algorithm for this adjustment and demonstrate improvement for both coarse- and fine-grained classification. We evaluate our method using convolutional neural networks; however, the algorithm can be applied to any classifier that outputs category-wise probabilities.
Citations: 8
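The abstract above describes re-weighting fine-grained class probabilities using the semantics of their coarse parents. A hedged sketch of one such multiplicative adjustment; the class grouping, names, and rule are illustrative assumptions, and the paper's exact algorithm may differ:

```python
def adjust_fine_probs(fine_probs, coarse_probs, fine_to_coarse):
    """Re-weight each fine-class probability by the probability of its
    coarse parent, then renormalise so the result sums to one."""
    weighted = [p * coarse_probs[fine_to_coarse[i]]
                for i, p in enumerate(fine_probs)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Two coarse classes ("animal", "vehicle") over four fine classes.
fine_to_coarse = {0: 0, 1: 0, 2: 1, 3: 1}  # cat, dog, car, truck
fine = [0.30, 0.25, 0.35, 0.10]            # fine head slightly favours "car"
coarse = [0.9, 0.1]                        # coarse head is sure it's an animal
adjusted = adjust_fine_probs(fine, coarse, fine_to_coarse)
# after adjustment, "cat" and "dog" outrank "car"
```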
Musical Instrument Recognition in User-generated Videos using a Multimodal Convolutional Neural Network Architecture
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval Pub Date: 2017-06-06 DOI: 10.1145/3078971.3079002
Olga Slizovskaia, E. Gómez, G. Haro
Abstract: This paper presents a method for recognizing musical instruments in user-generated videos. Musical instrument recognition from music signals is a well-known task in the music information retrieval (MIR) field, where current approaches rely on the analysis of good-quality audio material. This work addresses a real-world scenario with several research challenges: the analysis of user-generated videos that vary in recording conditions and quality and may contain multiple instruments sounding simultaneously as well as background noise. Our approach does not focus only on the audio information; we exploit the multimodal information embedded in the audio and visual domains. To do so, we develop a Convolutional Neural Network (CNN) architecture that combines learned representations from both modalities at a late fusion stage. Our approach is trained and evaluated on two large-scale video datasets, YouTube-8M and FCVID. The proposed architectures demonstrate state-of-the-art results in audio and video object recognition, provide additional robustness to missing modalities, and remain computationally cheap to train.
Citations: 11
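Late fusion, as used above, combines per-modality representations only at the classifier stage. A toy sketch with random weights standing in for the trained network; zero-filling an absent modality is one simple way to keep producing predictions when a stream is missing, not necessarily the paper's mechanism, and all names and dimensions here are illustrative:

```python
import numpy as np

def late_fusion_logits(audio_emb, visual_emb, W, b, dim=3):
    """Concatenate the audio and visual embeddings and apply a linear
    classifier; a missing modality (None) is replaced with zeros."""
    a = np.zeros(dim) if audio_emb is None else np.asarray(audio_emb)
    v = np.zeros(dim) if visual_emb is None else np.asarray(visual_emb)
    return np.concatenate([a, v]) @ W + b

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))  # 4 hypothetical instrument classes
b = np.zeros(4)

logits = late_fusion_logits([0.2, 0.5, 0.1], [0.7, 0.3, 0.9], W, b)
logits_no_audio = late_fusion_logits(None, [0.7, 0.3, 0.9], W, b)
```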