Proceedings of the 2020 International Conference on Multimedia Retrieval: Latest Publications

Semantic Gated Network for Efficient News Representation
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390719
Xuxiao Bu, Bingfeng Li, Yaxiong Wang, Jihua Zhu, Xueming Qian, Marco Zhao
Learning an efficient news representation is a fundamental yet important problem for many tasks. Most existing news-related methods use only the textual information and discard the visual clues offered by the illustrations. We argue that the textual title and tags, together with the visual illustrations, form the core of a piece of news and express its content more efficiently. In this paper, we develop a novel framework, the Semantic Gated Network (SGN), which integrates the news title, tags, and visual illustrations into an efficient joint textual-visual feature, by which we can directly measure the relevance between two pieces of news. Specifically, we first harvest the tag embeddings with the proposed self-supervised classification model. In addition, the news title is fed into a sentence encoder, pretrained on pairs of semantically relevant news items, to learn efficient contextualized word vectors. The title feature is then extracted from the learned vectors and combined with the tag features to obtain the textual feature. Finally, we design a novel mechanism, the semantic gate, to adaptively fuse the textual feature and the image feature. Extensive experiments on a benchmark dataset demonstrate the effectiveness of our approach.
Citations: 1
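The abstract does not specify how the semantic gate fuses the two modalities; a common way to realize such a gate is a sigmoid-weighted, per-dimension convex combination of the textual and visual features. A minimal numpy sketch, with hypothetical names and randomly initialized weights standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def semantic_gate(text_feat, img_feat, W, b):
    """Gated fusion: a learned gate decides, per dimension, how much of
    the textual versus the visual feature enters the joint embedding."""
    z = np.concatenate([text_feat, img_feat])    # (2d,)
    g = sigmoid(W @ z + b)                       # gate values in (0, 1), shape (d,)
    return g * text_feat + (1.0 - g) * img_feat  # per-dimension convex mix

d = 8
text_feat = rng.standard_normal(d)
img_feat = rng.standard_normal(d)
W = rng.standard_normal((d, 2 * d)) * 0.1  # would be learned in practice
b = np.zeros(d)

fused = semantic_gate(text_feat, img_feat, W, b)
```

Because the gate output lies in (0, 1), each fused dimension stays between the corresponding textual and visual values, letting the network lean on whichever modality is more informative.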
Music Tower Blocks: Multi-Faceted Exploration Interface for Web-Scale Music Access
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3391928
M. Schedl, Michael Mayr, Peter Knees
We present Music Tower Blocks, a novel browsing interface for interactive music visualization, capable of handling the web-scale music collections now offered by major music streaming services. Based on a clustering created from fused metadata and acoustic features, a block-based skyline landscape is constructed, which the user can navigate in several ways (zooming, panning, changing the angle of slope). User-adjustable color coding highlights various facets, e.g., the distributions of genres and acoustic features. Several search and filtering capabilities are also provided (e.g., searching for artists and tracks, or filtering by track popularity to focus on top hits or discover unknown gems). In addition, Music Tower Blocks lets users connect their personal music streaming profiles and highlight their favorite or recently played music on the landscape, supporting exploration of regions of the landscape near to (or far from) their own taste.
Citations: 4
An Active Learning Framework for Duplicate Detection in SaaS Platforms
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3391933
Quy H. Nguyen, Dac H. Nguyen, Minh-Son Dao, Duc-Tien Dang-Nguyen, C. Gurrin, Binh T. Nguyen
With the rapid growth of user data in SaaS (Software-as-a-Service) platforms built on microservices, detecting duplicated entities has become essential for ensuring the integrity and consistency of data in many companies and businesses, primarily multinational corporations. Given the large volume of today's databases, a duplicate detection algorithm must be not only accurate but also practical, meaning it can return detection results as quickly as possible for a given request. Among existing approaches to the duplicate detection problem, Siamese neural networks trained with the triplet loss have become one of the most robust ways to measure the similarity of two entities (texts, paragraphs, or documents) and thereby identify all possible duplicated items. In this paper, we first propose a practical framework for building a duplicate detection system on a SaaS platform. Second, we present a new active learning scheme for training and updating duplicate detection algorithms, in which the crowd provides additional annotated data to enhance the chosen learning model, and Siamese neural networks with the triplet loss are used to construct an efficient model for the problem. Finally, we design a user interface for the proposed duplicate detection system, which can easily be applied in practice at different companies.
Citations: 0
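The triplet loss mentioned in the abstract above is a standard hinge on the distance gap between an anchor-positive pair (known duplicates) and an anchor-negative pair (known non-duplicates). A self-contained sketch on toy 2-D embeddings (the embeddings themselves would come from the Siamese encoder):

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the distance gap: push the duplicate (positive) closer to
    the anchor than the non-duplicate (negative) by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy embeddings: the positive sits near the anchor, the negative far away.
anchor = np.array([1.0, 0.0])
positive = np.array([1.1, 0.0])
negative = np.array([-3.0, 0.0])

print(triplet_loss(anchor, positive, negative))  # 0.0: margin already satisfied
```

When the negative drifts inside the margin (e.g., `negative = [1.2, 0.0]`), the loss becomes positive and gradients would push the pair apart during training.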
Multi-Graph Group Collaborative Filtering
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390715
Bo Jiang
The task of recommending an item or an event to a user group attracts wide attention. Most existing works obtain a group preference by aggregating the personalized preferences within the group. However, groups, users, and items are connected in a more complex structure; for example, users in the same group may have different preferences. It is therefore important to introduce the correlations among groups, users, and items into embedding learning. To address this problem, we propose Multi-Graph Group Collaborative Filtering (MGGCF), which refines the group, user, and item representations according to three bipartite graphs. Moreover, since MGGCF refines the group, user, and item embeddings simultaneously, it benefits both group recommendation and individual recommendation tasks. Extensive experiments are conducted on one real-world dataset and two synthetic datasets. Empirical results demonstrate that MGGCF significantly improves not only group recommendation but also item recommendation. Further analysis verifies the importance of embedding propagation for learning better user, group, and item representations, which reveals the rationality and effectiveness of MGGCF.
Citations: 3
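The embedding propagation the abstract credits for MGGCF's gains can be illustrated on a single user-item bipartite graph: each node's embedding is mixed with the mean embedding of its neighbors, so connected nodes drift toward each other. This is a generic one-step sketch with a hypothetical `propagate` helper and a fixed 0.5/0.5 mixing weight, not the paper's exact update rule:

```python
import numpy as np

def propagate(user_emb, item_emb, interactions):
    """One embedding-propagation step on a user-item bipartite graph:
    mix each node's embedding with the mean embedding of its neighbors."""
    new_user = user_emb.copy()
    new_item = item_emb.copy()
    item_neigh = {}
    for u, items in interactions.items():
        new_user[u] = 0.5 * user_emb[u] + 0.5 * item_emb[items].mean(axis=0)
        for i in items:
            item_neigh.setdefault(i, []).append(u)
    for i, users in item_neigh.items():
        new_item[i] = 0.5 * item_emb[i] + 0.5 * user_emb[users].mean(axis=0)
    return new_user, new_item

# Users 0 and 1 both interact with item 0; user 2 is isolated.
user_emb = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
item_emb = np.array([[0.0, 0.0]])
interactions = {0: [0], 1: [0]}

new_user, new_item = propagate(user_emb, item_emb, interactions)
```

After one step, the two users who share an item move closer together while the isolated user is untouched, which is exactly the collaborative signal the bipartite graphs are meant to inject.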
Attention Mechanisms, Signal Encodings and Fusion Strategies for Improved Ad-hoc Video Search with Dual Encoding Networks
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390737
Damianos Galanopoulos, V. Mezaris
In this paper, the problem of retrieving unlabeled videos with textual queries is addressed. We present an extended dual encoding network that uses more than one encoding of the visual and textual content, as well as two different attention mechanisms. The latter serve to highlight the temporal locations in each modality that contribute most to effective retrieval. The different encodings of the visual and textual inputs, along with early/late fusion strategies, are examined for further performance improvements. Experimental evaluations and comparisons with state-of-the-art methods document the merit of the proposed network.
Citations: 15
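"Highlighting temporal locations" with attention typically means scoring each time step, softmaxing the scores, and pooling with the resulting weights. A minimal sketch with a random vector standing in for the learned attention query (the paper's actual mechanisms are not specified here):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(frames, query):
    """Temporal attention: score each frame feature against a query vector,
    normalize the scores with softmax, and return the weighted sum, so
    frames that matter more for retrieval contribute more to the encoding."""
    scores = frames @ query           # (T,) one score per temporal location
    weights = softmax(scores)         # (T,) nonnegative, sums to 1
    return weights @ frames, weights  # pooled feature (d,), weights (T,)

T, d = 5, 4
rng = np.random.default_rng(1)
frames = rng.standard_normal((T, d))  # per-frame visual features
query = rng.standard_normal(d)        # stand-in for a learned attention vector

pooled, w = attention_pool(frames, query)
```

The same pooling applies verbatim to the textual side, with word vectors in place of frame features.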
Google Helps YouTube: Learning Few-Shot Video Classification from Historic Tasks and Cross-Domain Sample Transfer
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390687
Xinzhe Zhou, Yadong Mu
The labor-intensive nature of video annotation has inspired recent research on few-shot video classification. The core motivation of our work is to mitigate the supervision-scarcity issue in this few-shot setting via cross-domain meta-learning. In particular, we aim to harness large-scale, richly annotated image data (the source domain) for few-shot video classification (the target domain). The source data is heterogeneous (images vs. videos) and noisily labeled, so it is not directly usable in the target domain. This work proposes the meta-learning input-transformer (MLIT), a novel deep network that tames the noisy source data so that it is more amenable to use in the target domain. It has two key traits. First, to bridge the data-distribution gap between the source and target domains, MLIT includes learnable neural layers that reweigh and transform the source data, effectively suppressing corrupted or noisy source samples. Second, MLIT is designed to learn from historic video classification tasks in the target domain, which significantly elevates accuracy on unseen video categories. Comprehensive empirical evaluations on two large-scale video datasets, ActivityNet and Kinetics-400, demonstrate the superiority of the proposed method.
Citations: 3
Itinerary Planning via Deep Reinforcement Learning
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390727
Shengxin Chen, Bo-Hao Chen, Zhaojiong Chen, Yunbing Wu
Itinerary planning, which provides tailor-made tours for each traveler, is a fundamental yet inefficient task in route recommendation. In this paper, we propose an automatic route recommendation approach based on deep reinforcement learning to solve the itinerary planning problem. We formulate the automatic generation of route recommendations as a Markov Decision Process (MDP) and solve it with a variational agent optimized through the deep Q-learning algorithm. We train our agent on open data from various cities and show that it achieves notable improvements over other state-of-the-art methods.
Citations: 6
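The MDP framing can be made concrete with a toy itinerary: states are the current point of interest (POI), actions move to another POI, and rewards encode how good a transition is. The tabular Q-learning below is only a stand-in for the paper's deep Q-network (which replaces the table with a neural approximator); the POI graph and rewards are invented for illustration:

```python
import random

def q_learning_itinerary(rewards, n_pois, horizon=2, episodes=3000,
                         alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a toy itinerary MDP: each episode starts at
    POI 0 and makes `horizon` moves; Q(s, a) estimates the value of
    traveling from POI s to POI a."""
    random.seed(0)
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        s = 0
        for step in range(horizon):
            acts = [a for a in range(n_pois) if a != s]
            # Epsilon-greedy action selection.
            a = random.choice(acts) if random.random() < eps \
                else max(acts, key=lambda x: q(s, x))
            r = rewards.get((s, a), 0.0)
            if step < horizon - 1:  # bootstrap from the best next move
                target = r + gamma * max(q(a, x) for x in range(n_pois) if x != a)
            else:                   # last move of the tour: no bootstrap
                target = r
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = a
    return Q

# Hypothetical rewards: visiting POI 2 then POI 1 is the best two-stop tour.
rewards = {(0, 2): 1.0, (2, 1): 2.0, (0, 1): 0.5}
Q = q_learning_itinerary(rewards, n_pois=3)
```

After training, reading the greedy action out of `Q` from the start state recovers the high-reward tour 0 → 2 → 1.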
Emotion Recognition from Galvanic Skin Response Signal Based on Deep Hybrid Neural Networks
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-08 | DOI: 10.1145/3372278.3390738
Imam Yogie Susanto, Tse-Yu Pan, Chien-Wen Chen, Min-Chun Hu, Wen-Huang Cheng
Emotion reflects human beings' physiological and psychological state. Galvanic Skin Response (GSR) reveals the electrical characteristics of human skin and is widely used to recognize the presence of emotion. In this work, we propose an emotion recognition framework based on deep hybrid neural networks, in which a 1D CNN and a residual bidirectional GRU are employed for time-series analysis. Experimental results show that the proposed method outperforms other state-of-the-art methods. In addition, we port the proposed emotion recognition model to a Raspberry Pi and build a real-time emotion-interaction robot to verify the efficiency of this work.
Citations: 9
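The 1D CNN front end of such a hybrid network slides a kernel along the raw GSR time series. A dependency-free sketch of that core operation (valid-mode cross-correlation, the convention deep-learning libraries use), with toy skin-conductance samples:

```python
def conv1d(signal, kernel, stride=1):
    """Valid-mode 1D convolution (cross-correlation), as computed by the
    first layer of a 1D CNN over a raw time series: slide the kernel along
    the signal and take the dot product at each position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

# A 2-tap averaging kernel makes the sliding-window output easy to verify.
gsr = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # toy skin-conductance samples
kernel = [0.5, 0.5]
print(conv1d(gsr, kernel))  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

In the full model, many such kernels are learned rather than fixed, and their outputs feed the bidirectional GRU, which captures longer-range temporal context.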
YOLO-mini-tiger: Amur Tiger Detection
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-01 | DOI: 10.1145/3372278.3390710
Runchen Wei, Ning He, K. Lu
In this paper, we present our solution for tiger detection in the 2019 Computer Vision for Wildlife Conservation Challenge (CVWC2019). We introduce an efficient deep tiger detector consisting of a convnet channel-adaptation method and an improved tiger detection method based on You Only Look Once version 3 (YOLOv3). Considering the limited memory and computing power of small embedded devices, we use EfficientNet-B0 and Darknet-53 as backbone detection networks and adapt them, inspired by channel pruning and knowledge distillation, to balance their depth and width. Our results show that after the architecture adjustment of Darknet-53, floating-point computation decreases by 93% and model size by 97%, while accuracy drops by only 1%; after the adjustment of EfficientNet-B0, floating-point computation decreases by 66% and model size by 70%, again with only a 1% drop in accuracy. We also compare the GIoU loss and the MSE loss in the training stage. The GIoU loss increases the average AP over IoU thresholds from 0.5 to 0.95 without affecting training or inference speed, making it experimentally well suited to tiger detection in the wild. The proposed method outperforms previous Amur tiger detection methods presented at CVWC2019.
Citations: 1
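The GIoU metric the abstract compares against MSE is IoU minus the fraction of the smallest enclosing box not covered by the union, so it stays informative even for disjoint boxes where plain IoU is flat at zero. A standalone sketch of the metric itself (the training loss is typically 1 - GIoU):

```python
def giou(box_a, box_b):
    """Generalized IoU for two axis-aligned boxes (x1, y1, x2, y2):
    IoU minus the fraction of the smallest enclosing box that lies
    outside the union of the two boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlapping: 1/7 - 2/9
print(giou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint: negative, still a gradient signal
```

For identical boxes GIoU equals 1 (like IoU); as boxes separate it decreases toward -1, which is why it provides a useful training signal where MSE on raw coordinates does not reflect overlap quality.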
Super-Resolution Coding Defense Against Adversarial Examples
Proceedings of the 2020 International Conference on Multimedia Retrieval | Pub Date: 2020-06-01 | DOI: 10.1145/3372278.3390689
Yanjie Chen, Likun Cai, Wei Cheng, Hongya Wang
Deep neural networks have achieved state-of-the-art performance in many fields, including image classification. However, recent studies show that these models are vulnerable to adversarial examples formed by adding small but intentional perturbations to clean examples. In this paper, we introduce an effective defense against adversarial examples. The key idea is to leverage a super-resolution coding (SR-coding) network to remove the noise from adversarial examples. Furthermore, to strengthen the defense, we propose a novel hybrid approach that combines SR-coding with adversarial training to train robust neural networks. Experiments on benchmark datasets demonstrate the effectiveness of our method against both state-of-the-art white-box and black-box attacks. The proposed approach significantly improves defense performance, achieving up to a 41.26% accuracy improvement with ResNet18 under the PGD white-box attack.
Citations: 3