Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval: Latest Publications

Dictionary Learning based Supervised Discrete Hashing for Cross-Media Retrieval
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206045
Ye Wu, Xin Luo, Xin-Shun Xu, Shanqing Guo, Yuliang Shi
Abstract: Hashing techniques have attracted considerable attention in large-scale multimedia retrieval due to their low storage cost and fast query speed, and many hashing models have been proposed for the cross-modal retrieval task. However, several problems remain. First, most methods project heterogeneous data into a common space with a linear projection matrix, which can incur large error: semantically similar heterogeneous data may be hard to bring close together in the latent space under a linear projection. Second, most existing cross-modal hashing methods preserve label information during learning with a simple pairwise similarity matrix, which cannot fully exploit the discriminative property of the labels. Third, most supervised methods solve a relaxed continuous optimization problem by dropping the discrete constraints, which may lead to large quantization error. To overcome these limitations, this paper proposes a novel cross-modal hashing method, Dictionary Learning based Supervised Discrete Hashing (DLSDH). Specifically, it learns dictionaries and generates a sparse representation for every instance, which is better suited to projection into a latent space. To make full use of label information, it constructs a new pairwise similarity matrix from cosine similarity, which carries more information than a binary similarity. Moreover, it learns the discrete hash codes directly instead of relaxing the discrete constraints. Extensive experiments on three benchmark datasets demonstrate that it outperforms several state-of-the-art methods for the cross-modal retrieval task.
Citations: 10
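The cosine-similarity pairwise matrix that the DLSDH abstract above contrasts with a binary same/different matrix can be illustrated in a few lines. This is a minimal sketch of the general idea, not the paper's implementation; the toy label matrix and variable names are my own:

```python
import numpy as np

# Multi-label annotations for 4 instances over 3 labels (1 = label present).
labels = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
], dtype=float)

# Pairwise cosine similarity between label vectors: entries are graded by
# label overlap, rather than the hard 0/1 of a same-class indicator matrix.
normed = labels / np.linalg.norm(labels, axis=1, keepdims=True)
S = normed @ normed.T

print(np.round(S, 3))
```

Instances 0 and 3 share identical labels (similarity 1.0), instances 0 and 1 overlap partially (about 0.707), and instances 0 and 2 share nothing (0.0), so the matrix encodes degrees of semantic similarity that a binary matrix would collapse.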
Session details: Doctoral Symposium Session
Martha Larson, Takahiro Ogawa
DOI: 10.1145/3252933
Citations: 0
Session details: Special Session 2: Social-Media Visual Summarization / Large-Scale 3D Multimedia Analysis and Applications
Joao Magalhaes, Rongrong Ji
DOI: 10.1145/3252932
Citations: 0
Deep Pairwise Classification and Ranking for Predicting Media Interestingness
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206078
Jayneel Parekh, Harshvardhan Tibrewal, Sanjeel Parekh
Abstract: With the explosive increase in multimedia content consumption in recent years, media interestingness analysis has gained considerable attention. This paper tackles the problem of image interestingness in videos and proposes a novel algorithm that ranks all frames in a video via pairwise comparisons of frames. Experiments on the Predicting Media Interestingness dataset confirm its effectiveness over existing solutions. In terms of the official metric, Mean Average Precision at 10, it outperforms the previous state of the art on this dataset (to the best of our knowledge). Additional results on video interestingness substantiate the flexibility and reliability of the approach.
Citations: 4
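The abstract above ranks frames from pairwise comparisons. One standard way to turn pairwise decisions into a global ranking is to count wins across all pairs. The sketch below uses a toy score table in place of the paper's trained network; the frame names and aggregation rule are illustrative assumptions, not the paper's method:

```python
from itertools import combinations

# Hypothetical stand-in for a pairwise interestingness classifier: in the
# paper's setting this decision would come from a trained deep network.
scores = {"f0": 0.2, "f1": 0.9, "f2": 0.5, "f3": 0.7}

def more_interesting(a, b):
    return scores[a] > scores[b]

# Aggregate pairwise decisions into a global ranking by counting wins.
frames = sorted(scores)
wins = {f: 0 for f in frames}
for a, b in combinations(frames, 2):
    winner = a if more_interesting(a, b) else b
    wins[winner] += 1

ranking = sorted(frames, key=lambda f: wins[f], reverse=True)
print(ranking)  # ['f1', 'f3', 'f2', 'f0'], most to least interesting
```

With n frames this evaluates n(n-1)/2 pairs, which is why pairwise schemes usually subsample comparisons for long videos.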
Exploiting Relational Information in Social Networks using Geometric Deep Learning on Hypergraphs
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206062
Devanshu Arya, M. Worring
Abstract: Online social networks comprise a diverse set of entities, including users, images, and posts, which makes predicting interdependencies between entities challenging. We need a model that transfers information from a given type of relation between entities to predict other types of relations, irrespective of entity type. Devising such a generic framework requires capturing the relational information between entities without any entity-dependent information. There are two challenges: (a) a social network has an intrinsic community structure, and within these communities some relations are far more complicated than pairwise relations and thus cannot simply be modeled by a graph; (b) a social network contains different types of entities and relations, and accounting for all of them makes a model difficult to formulate. In this paper, we claim that representing social networks as hypergraphs improves the task of predicting missing information about an entity by capturing higher-order relations. We study the behavior of our method through experiments on the CLEF dataset, consisting of images from Flickr, an online photo-sharing social network.
Citations: 24
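The abstract above argues that hyperedges capture beyond-pairwise community relations. A common construction for spectral methods on hypergraphs is the normalized hypergraph Laplacian of Zhou et al.; the sketch below builds it for a toy incidence matrix. This is one standard formulation with unit edge weights, not necessarily the exact operator used in the paper:

```python
import numpy as np

# Toy hypergraph over 4 entities with 2 hyperedges (e.g. two communities).
# Unlike a graph edge, a hyperedge can join any number of entities at once.
H = np.array([
    [1, 0],   # entity 0 in hyperedge 0
    [1, 0],   # entity 1 in hyperedge 0
    [1, 1],   # entity 2 bridges both hyperedges
    [0, 1],   # entity 3 in hyperedge 1
], dtype=float)

dv = H.sum(axis=1)                 # vertex degrees
de = H.sum(axis=0)                 # hyperedge sizes
Dv_isqrt = np.diag(dv ** -0.5)
De_inv = np.diag(1.0 / de)

# Normalized hypergraph Laplacian (unit edge weights):
#   L = I - Dv^{-1/2} H De^{-1} H^T Dv^{-1/2}
L = np.eye(4) - Dv_isqrt @ H @ De_inv @ H.T @ Dv_isqrt

print(np.round(L, 3))
```

L is symmetric positive semidefinite with Dv^{1/2}·1 in its null space, so spectral machinery for graphs (eigenvector embeddings, convolutions) carries over to the hypergraph.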
Learning Multilevel Semantic Similarity for Large-Scale Multi-Label Image Retrieval
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206027
Ge Song, Xiaoyang Tan
Abstract: We present a novel Deep Supervised Hashing with code operation (DSOH) method for large-scale multi-label image retrieval. In contrast with existing methods, this approach respects both the intention gap and the intrinsic multilevel similarity of multi-label data. In particular, our method allows a user to present multiple query images simultaneously, rather than a single one, to better express her intention; correspondingly, a separate sub-network in our architecture is specifically designed to fuse the query intention represented by each single query. Furthermore, since each training image is annotated with multiple labels to enrich its semantic representation, we propose a new margin-adaptive triplet loss to learn the fine-grained similarity structure of multi-label data, which is known to be hard to capture. The whole system is trained end to end, and our experimental results demonstrate that the proposed method not only learns useful multilevel semantic-similarity-preserving binary codes but also achieves state-of-the-art retrieval performance on three popular datasets.
Citations: 10
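One plausible reading of the margin-adaptive triplet loss mentioned above is a triplet loss whose margin scales with the label-similarity gap between the positive and negative pairs. The formulation below is an illustrative guess at that idea, not the paper's actual loss; all names and the margin rule are assumptions:

```python
import numpy as np

def margin_adaptive_triplet(anchor, pos, neg, sim_ap, sim_an, base_margin=1.0):
    """Triplet loss with a margin that grows with the semantic gap.

    sim_ap / sim_an are label similarities (e.g. cosine of label vectors)
    of the (anchor, positive) and (anchor, negative) pairs. Illustrative
    formulation only, not the DSOH paper's exact loss.
    """
    margin = base_margin * (sim_ap - sim_an)   # bigger label gap -> bigger margin
    d_ap = np.sum((anchor - pos) ** 2)
    d_an = np.sum((anchor - neg) ** 2)
    return max(0.0, d_ap - d_an + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # embedded close to the anchor
n = np.array([1.0, 1.0])   # embedded far from the anchor

loss_easy = margin_adaptive_triplet(a, p, n, sim_ap=0.9, sim_an=0.1)  # -> 0.0
loss_hard = margin_adaptive_triplet(a, n, p, sim_ap=0.9, sim_an=0.1)  # -> 2.79
print(loss_easy, loss_hard)
```

The point of the adaptive margin is that triplets whose labels overlap almost equally are pushed apart only slightly, while triplets with a large semantic gap incur a large penalty, giving the graded "multilevel" structure a fixed margin cannot express.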
The Ongoing Evolution of Broadcast Technology
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3210489
K. Mitani
Abstract: The media environment of program production, content delivery, and viewing has been changing with progress in broadcasting and communication technologies, as well as technologies such as IoT, cloud computing, and artificial intelligence (AI). In December 2018, 8K and 4K UHDTV satellite broadcasting will start in Japan, meaning viewers will soon be able to enjoy 8K and 4K programs featuring a wide color gamut and high dynamic range, together with 22.2 multichannel audio, at home. Meanwhile, distribution services delivering content to PCs and smartphones over the Internet have spread rapidly, and the introduction of next-generation mobile networks (5G) will accelerate this spread. The arrival of such advanced broadcast and broadband technologies, and the consequent changes in lifestyle, present broadcasters with a great opportunity for a new stage of development. At NHK Science & Technology Research Laboratories (NHK STRL), we are pursuing a wide range of research aimed at creating new broadcast services that provide viewing experiences never before imagined and user experiences more attuned to daily life. To enhance the convenience of television and the value of TV programming, we are developing technology for connecting the TV experience with various activities in everyday life. Extensions to "Hybridcast Connect" will drive applications that link TVs, smartphones, and IoT devices, enabling spontaneous consumption of content during everyday activities through the various devices around the user. Establishing a new program production workflow with AI, which we call "Smart Production", is one of our most important research topics. We are developing speech and face recognition technologies for producing closed captions and metadata efficiently, as well as technologies for automatically converting content into computer-generated sign language, audio descriptions, and simplified Japanese. This presentation introduces these research achievements targeting 2020 and beyond, along with other broadcasting technology trends, including 4K/8K UHDTV broadcasting in Japan, 3D imaging, and VR/AR.
Citations: 0
Feature Reconstruction by Laplacian Eigenmaps for Efficient Instance Search
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206032
Bingqing Ke, Jie Shao, Zi Huang, Heng Tao Shen
Abstract: Instance search aims to retrieve images containing a particular query instance. Image features derived from pre-trained convolutional neural networks (CNNs) have recently been shown to provide promising performance for image retrieval; however, the robustness of these features is still limited by hard positives and hard negatives. To address this issue, this work focuses on reconstructing a new representation from conventional CNN features that captures the intrinsic image manifold of the original feature space. After feature reconstruction, Euclidean distance can be applied in the new space to measure the pairwise distance between feature points. The proposed method is highly efficient, benefiting from linear search complexity and a further optimization for speedup. Experiments demonstrate that our method achieves promising efficiency with highly competitive accuracy, capturing implicit embedding information in images while significantly reducing computational complexity.
Citations: 1
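The Laplacian-eigenmaps idea named in the title above, embedding points so that Euclidean distance in the new space reflects the data manifold, can be sketched in plain numpy. This is the textbook construction on toy data, not the paper's reconstruction pipeline; the cluster layout and kernel width are assumptions:

```python
import numpy as np

# Toy "CNN features": two clusters in 2-D standing in for image descriptors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(1.5, 0.1, (5, 2))])

# Gaussian affinity matrix over pairwise squared distances.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 0.5 ** 2))
np.fill_diagonal(W, 0.0)

# Unnormalized graph Laplacian L = D - W.
L = np.diag(W.sum(axis=1)) - W

# Eigenvectors with the smallest nonzero eigenvalues give manifold-aware
# coordinates; Euclidean distance is then measured in this new space.
vals, vecs = np.linalg.eigh(L)
Y = vecs[:, 1:3]          # skip the constant eigenvector, keep 2-D coordinates
print(Y.shape)            # (10, 2)
```

On this toy input the first embedding coordinate (the Fiedler vector) takes one sign on each cluster, so nearest-neighbor search in Y already respects the cluster structure.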
Personal Basketball Coach: Tactic Training through Wireless Virtual Reality
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206084
Wan-Lun Tsai
Abstract: In this paper, we present a basketball tactic-training framework that uses virtual reality (VR) technology to improve the effectiveness and experience of tactic learning. Our proposal comprises 1) a wireless VR interaction system with motion-capture devices applicable to fast-moving basketball scenarios, and 2) a computing server that generates three-dimensional virtual players, defenders, and guidance on advantageous tactics. With the assistance of our VR training system, the user can vividly experience how tactics are executed by viewing them from a specific player's perspective. Moreover, the tactic movement guidance and virtual defenders rendered in our VR system make users feel as if they are playing in a real basketball game, which improves the efficiency and effectiveness of tactic training.
Citations: 11
Session details: Poster Paper Session
Keiji Yanai
DOI: 10.1145/3252930
Citations: 0