Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval: Latest Publications

Balanced Search Space Partitioning for Distributed Media Redundant Indexing
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3078999
André Mourão, João Magalhães
Abstract: This paper addresses the problem of balanced, redundant indexing of media information. Our goal is to partition and distribute the search index, taking advantage of distributed-system properties: balanced load across nodes, redundancy when a node goes down, and efficient node usage under concurrent querying. We follow an information-compression approach and propose to represent data with overcomplete codebooks, where each document is represented by only a few codewords and an indexing node is responsible for several codewords. Quantization algorithms are designed to fit the original data as well as possible, leading to a bias towards codewords that fit the principal directions of the data. In this paper, we propose the balanced KSVD (B-KSVD) algorithm, which distributes the allocation of data across a balanced number of codewords, according to the global distribution of the data. Indexing experiments showed that B-KSVD can achieve 38% 1-recall by inspecting only 1% of the full index, distributed over 10 partitions. Traditional methods based on k-means need either to use larger codebooks or to inspect a larger portion of the index to achieve the same retrieval performance.
Citations: 1
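The balancing idea the abstract describes, assigning each document to a few codewords while avoiding overloaded partitions, can be illustrated with a minimal greedy sketch. This is a hypothetical illustration of the intuition, not the B-KSVD algorithm; the dictionary `D`, the `capacity_weight` penalty, and the greedy loop are all assumptions.

```python
import numpy as np

def balanced_assign(X, D, n_nonzero=2, capacity_weight=0.5):
    """Greedily assign each row of X to n_nonzero codewords of D,
    discounting codewords whose partitions are already heavily loaded.
    Sketch of the balancing intuition only, not the authors' method."""
    load = np.zeros(D.shape[0])
    assignments = []
    for x in X:
        sim = D @ x                                   # affinity of x to each codeword
        penalty = capacity_weight * load / (load.sum() + 1e-9)
        top = np.argsort(sim - penalty)[-n_nonzero:]  # best codewords after penalty
        load[top] += 1                                # documents land on these partitions
        assignments.append(top)
    return assignments, load
```

With the penalty term set to zero this degenerates to plain nearest-codeword assignment, which is exactly the biased allocation the paper argues against.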
LireSolr: A Visual Information Retrieval Server
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3079014
M. Lux, M. Riegler, P. Halvorsen, Glenn Macstravic
Abstract: In this paper, we present LireSolr, an open-source image retrieval server built on top of the LIRE library and the Apache Solr search server. With LireSolr, visual information retrieval can be run on a server, which allows better distribution of workloads and simplifies applications in several areas, including mobile and web. Furthermore, we showcase several example scenarios of how LireSolr can be used, pointing out the broad range of possibilities and applications. The system is easy to install and set up, and the large number of retrieval tools provided either by LIRE or by Apache Solr itself is made easily available on the search server. Moreover, our tool demonstrates how predictions from CNNs can easily be used to extend the visual information retrieval functionality.
Citations: 0
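Since LireSolr exposes retrieval over plain HTTP, a client can be a few lines in any language. The sketch below assumes a LireSolr request handler registered at `/lireq` with `url`, `feature`, and `rows` parameters, as described in the LireSolr documentation; the core name, feature code, and response fields should be checked against your installation and are assumptions here.

```python
import requests

# Hypothetical LireSolr core; adjust host and core name to your setup.
SOLR = "http://localhost:8983/solr/lire"

# Ask the server for images visually similar to a query image by URL.
# Parameter names follow the LireSolr docs but may vary by version.
resp = requests.get(f"{SOLR}/lireq", params={
    "url": "http://example.com/query.jpg",  # image to search by
    "feature": "cl",                        # ColorLayout descriptor (assumed code)
    "rows": 10,
})
resp.raise_for_status()
for doc in resp.json()["response"]["docs"]:
    print(doc.get("id"), doc.get("d"))      # "d" = distance, if the handler returns it
```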
Groups Re-identification with Temporal Context
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3078978
Michal Koperski, Sławomir Bąk, Peter Carr
Abstract: Re-identification methods often require well-aligned, unoccluded detections of an entire subject. Such assumptions are impractical in real-world scenarios, where people tend to form groups. To circumvent poor detection performance caused by occlusions, we use fixed regions of interest and employ codebook-based visual representations. We account for illumination variations between cameras using a coupled clustering method that learns per-camera codebooks with entries that correspond across cameras. Because predictable movement patterns exist in many scenarios, we also incorporate temporal context to improve re-identification performance. This includes learning expected travel times directly from data and using mutual-exclusion constraints to encourage solutions that maintain temporal ordering. Our experiments illustrate the merits of the proposed approach in challenging re-identification scenarios, including crowded public spaces.
Citations: 2
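The way temporal context can augment an appearance model is easy to sketch: score a candidate match by how well the observed time gap fits the expected inter-camera travel time. The paper learns travel times from data; the Gaussian prior and the additive log-likelihood combination below are illustrative assumptions.

```python
import numpy as np

def reid_score(app_dist, dt, travel_mu, travel_sigma):
    """Combine appearance distance with a travel-time prior.
    app_dist: appearance distance between the two observations.
    dt: observed time gap; travel_mu/travel_sigma: expected transit
    time between the two cameras (Gaussian form is an assumption)."""
    time_ll = np.exp(-0.5 * ((dt - travel_mu) / travel_sigma) ** 2)
    return -app_dist + np.log(time_ll + 1e-9)  # higher = better match
```

A candidate whose transit time is implausible is penalized even if it looks similar, which is the effect the temporal context is meant to achieve.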
Session details: Special Oral Session: Identifying and Linking Interesting Content in Large Audiovisual Repositories
Maria Eskevich, R. Ordelman
DOI: 10.1145/3254625
Citations: 0
Generating Video Descriptions with Topic Guidance
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3079000
Shizhe Chen, Jia Chen, Qin Jin
Abstract: Generating video descriptions in natural language (a.k.a. video captioning) is a more challenging task than image captioning, as videos are intrinsically more complicated than images in two aspects. First, videos cover a broader range of topics, such as news, music, and sports. Second, multiple topics can coexist in the same video. In this paper, we propose a novel captioning model, the topic-guided model (TGM), to generate topic-oriented descriptions for videos in the wild by exploiting topic information. In addition to predefined topics, i.e., category tags crawled from the web, we also mine topics in a data-driven way from the training captions with an unsupervised topic mining model. We show that the data-driven topics reflect a better topic schema than the predefined topics. To predict topics for test videos, we treat the topic mining model as a teacher that trains a student, the topic prediction model, which exploits all modalities in the video, especially the speech modality. We propose a series of caption models to exploit topic guidance, including implicitly using the topics as input features to generate topic-related words, and explicitly modifying the decoder weights with topics so that the decoder functions as an ensemble of topic-aware language decoders. Comprehensive experimental results on MSR-VTT, currently the largest video captioning dataset, prove the effectiveness of our topic-guided model, which significantly surpasses the winning performance of the 2016 MSR Video to Language Challenge.
Citations: 19
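The "implicit" guidance variant, feeding the topic vector as an input feature at each decoding step, is the simplest to sketch. The layer sizes, the single-layer GRU, and the concatenation scheme below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TopicGuidedDecoder(nn.Module):
    """Minimal sketch of implicit topic guidance: the topic vector is
    concatenated with the word embedding at every decoding step."""
    def __init__(self, vocab_size, embed_dim=300, topic_dim=20, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim + topic_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, topic, hidden=None):
        emb = self.embed(words)                              # (B, T, E)
        topic = topic.unsqueeze(1).expand(-1, emb.size(1), -1)  # repeat per step
        h, hidden = self.gru(torch.cat([emb, topic], dim=-1), hidden)
        return self.out(h), hidden                           # per-step word logits
```

The explicit variant the abstract mentions would instead use the topic to modulate the decoder's weights, effectively selecting among topic-aware language decoders.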
TaiChi: A Fine-Grained Action Recognition Dataset
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3079039
Shan Sun, Feng Wang, Qi Liang, Liang He
Abstract: In this paper, we introduce TaiChi, a fine-grained action recognition dataset. It consists of unconstrained, user-uploaded web videos containing camera motion and partial occlusions, which pose new challenges for fine-grained action recognition compared to existing datasets. In this dataset, 2,772 samples of 58 fine-grained action classes are manually annotated. Additionally, we provide baseline action recognition results using the state-of-the-art Improved Dense Trajectory features and Fisher Vector representation, with a mean average precision (mAP) of 51.39%.
Citations: 9
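The Fisher Vector encoding used by the baseline follows a standard recipe: encode a video's local descriptors by the gradients of a GMM's log-likelihood with respect to its means and variances. A minimal sketch under the usual diagonal-covariance assumption is below; the GMM size, any power/L2 normalization, and the Improved Dense Trajectory extraction itself are outside this sketch.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm: GaussianMixture):
    """Encode local descriptors (N, d) as a 2*K*d Fisher Vector w.r.t.
    a GMM fit with covariance_type='diag'. Standard first- and
    second-order formulation."""
    q = gmm.predict_proba(descriptors)                 # (N, K) soft assignments
    N, K = q.shape
    mu, sigma = gmm.means_, np.sqrt(gmm.covariances_)  # (K, d) each
    fv = []
    for k in range(K):
        diff = (descriptors - mu[k]) / sigma[k]        # whitened residuals
        w = q[:, k:k + 1]
        g_mu = (w * diff).sum(0) / (N * np.sqrt(gmm.weights_[k]))
        g_sig = (w * (diff ** 2 - 1)).sum(0) / (N * np.sqrt(2 * gmm.weights_[k]))
        fv.extend([g_mu, g_sig])
    return np.concatenate(fv)
```

A linear classifier on these encodings is the usual final step of such baselines.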
Cross-modal Image-Graphics Retrieval by Neural Transfer Learning
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3078994
Fabian Junkert, Markus Eberts, A. Ulges, Ulrich Schwanecke
Citations: 1
Session details: Oral Session 5: Best Paper Candidates
V. Mezaris
DOI: 10.1145/3254624
Citations: 0
Transductive Visual-Semantic Embedding for Zero-shot Learning
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3078977
Xing Xu, Fumin Shen, Yang Yang, Jie Shao, Zi Huang
Abstract: Zero-shot learning (ZSL) aims to transfer knowledge via available semantic representations (e.g., attributes) between labeled source instances of seen classes and unlabeled target instances of unseen classes. Most existing ZSL approaches achieve this by learning a projection from the visual feature space to the semantic representation space based on the source instances and directly applying it to the target instances. However, the intrinsic manifold structures residing in both the semantic representations and the visual features are not effectively incorporated into the learned projection function. Moreover, these methods may suffer from the inherent projection-shift problem, due to the disjointness of the seen and unseen classes. To overcome these drawbacks, we propose a novel framework termed transductive visual-semantic embedding (TVSE) for ZSL. Specifically, TVSE first learns a latent embedding space that incorporates the manifold structures of both the labeled source instances and the unlabeled target instances under the transductive setting. In the learned space, each instance is viewed as a mixture of seen-class scores. TVSE then constructs a relational mapping between seen and unseen classes using the available semantic representations, and applies it to map the seen-class scores of the target instances to predictions over the unseen classes. Extensive experiments on four benchmark datasets demonstrate that TVSE achieves competitive performance compared with the state of the art on zero-shot recognition and retrieval tasks.
Citations: 16
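The final step the abstract describes, mapping seen-class scores to unseen-class predictions through a class-level relational mapping, can be illustrated with a ConSE-style simplification: relate classes by the cosine similarity of their attribute vectors. The transductive embedding step is omitted, and the cosine mapping is an assumption for illustration, not the paper's construction.

```python
import numpy as np

def zero_shot_predict(seen_scores, S_seen, S_unseen):
    """Map seen-class scores to unseen-class predictions via attribute
    similarity (ConSE-style simplification).
    seen_scores: (N, C_seen) per-instance seen-class scores.
    S_seen: (C_seen, A) and S_unseen: (C_unseen, A) attribute vectors."""
    def norm(M):
        return M / (np.linalg.norm(M, axis=1, keepdims=True) + 1e-9)
    rel = norm(S_seen) @ norm(S_unseen).T   # (C_seen, C_unseen) class relations
    unseen_scores = seen_scores @ rel       # transfer scores to unseen classes
    return unseen_scores.argmax(axis=1)     # predicted unseen class per instance
```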
Session details: Oral Session: Brave New Ideas
C. Ngo
DOI: 10.1145/3254618
Citations: 0