Proceedings of the 2020 International Conference on Multimedia Retrieval — Latest Publications

Continuous Health Interface Event Retrieval
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-16 DOI: 10.1145/3372278.3390705
Vaibhav Pandey, Nitish Nag, Ramesh C. Jain
Abstract: Knowing the state of our health at every moment in time is critical for advances in health science. Using data obtained outside an episodic clinical setting is the first step towards building a continuous health estimation system. In this paper, we explore a system that allows users to combine events and data streams from different sources and retrieve complex biological events, such as cardiovascular volume overload, using measured lifestyle events. These complex events, which have been explored in the biomedical literature and which we call interface events, have a direct causal impact on the relevant biological systems; they are the interface through which lifestyle events influence our health. We retrieve the interface events from existing events and data streams by encoding domain knowledge using the event operator language. The interface events can then be utilized to provide a continuous estimate of the biological variables relevant to the user's health state. The event-based framework also makes it easier to estimate which event is causally responsible for a particular change in an individual's health state.
Citations: 6
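The abstract names an "event operator language" for encoding domain knowledge but does not define it. A purely illustrative sketch follows: the `Event` type, the `within` operator, and the meal/fluid rule are all invented for illustration, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    start: float  # hours since midnight
    end: float

def within(events, name, window):
    """Operator: select events of a given type that overlap a time window."""
    lo, hi = window
    return [e for e in events if e.name == name and e.start < hi and e.end > lo]

def derive_interface_event(events):
    """Toy rule (hypothetical): a high-sodium meal with no fluid intake in the
    following four hours is flagged as a volume-overload-risk interface event."""
    for meal in within(events, "high_sodium_meal", (0, 24)):
        if not within(events, "fluid_intake", (meal.end, meal.end + 4)):
            yield Event("volume_overload_risk", meal.start, meal.end + 4)
```

Composing such operators over measured streams is what would let an event-based framework trace a health-state change back to the lifestyle event that caused it.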
Search Result Clustering in Collaborative Sound Collections
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-08 DOI: 10.1145/3372278.3390691
Xavier Favory, F. Font, Xavier Serra
Abstract: The large size of today's online multimedia databases makes retrieving their content a difficult and time-consuming task. Users of online sound collections typically submit search queries that express a broad intent, often making the system return large and unmanageable result sets. Search result clustering is a technique that organises search-result content into coherent groups, allowing users to identify useful subsets in their results. Obtaining coherent and distinctive clusters that can be explored with a suitable interface is crucial for making this technique a useful complement to traditional search engines. In our work, we propose a graph-based approach using audio features for clustering diverse sound collections obtained when querying large online databases. We propose an approach to assess the performance of different features at scale by taking advantage of the metadata associated with each sound. This analysis is complemented with an evaluation using ground-truth labels from manually annotated datasets. We show that using a confidence measure for discarding inconsistent clusters improves the quality of the partitions. After identifying the most appropriate features for clustering, we conduct an experiment with users performing a sound design task in order to evaluate our approach and its user interface. A qualitative analysis is carried out, including usability questionnaires and semi-structured interviews. This provides valuable new insights regarding the features that promote efficient interaction with the clusters.
Citations: 5
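The abstract reports that discarding inconsistent clusters via a confidence measure improves partition quality, without detailing the measure. A minimal sketch, assuming unit-normalizable feature vectors and a silhouette-like intra-minus-inter cosine score (both the score and the threshold are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def cluster_confidence(features, labels):
    """Score each cluster by mean intra-cluster cosine similarity minus
    mean similarity to points outside the cluster."""
    # normalize rows so dot products are cosine similarities
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = feats @ feats.T
    conf = {}
    for c in np.unique(labels):
        inside = labels == c
        intra = sims[np.ix_(inside, inside)].mean()
        inter = sims[np.ix_(inside, ~inside)].mean() if (~inside).any() else 0.0
        conf[c] = intra - inter
    return conf

def filter_clusters(features, labels, threshold=0.1):
    """Keep only clusters whose confidence exceeds the threshold."""
    conf = cluster_confidence(features, labels)
    return {c for c, v in conf.items() if v >= threshold}
```

A scattered cluster whose members resemble the rest of the collection as much as each other gets a confidence near zero and is discarded.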
Query-controllable Video Summarization
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-07 DOI: 10.1145/3372278.3390695
Jia-Hong Huang, M. Worring
Abstract: When video collections become huge, exploring both within and across videos efficiently is challenging. Video summarization is one way to tackle this issue. Traditional summarization approaches limit the effectiveness of video exploration because they generate only one fixed video summary for a given input video, independent of the user's information need. In this work, we introduce a method that takes a text-based query as input and generates a video summary corresponding to it. We do so by modeling video summarization as a supervised learning problem and propose an end-to-end deep-learning-based method for query-controllable video summarization that generates a query-dependent video summary. Our proposed method consists of a video summary controller, a video summary generator, and a video summary output module. To foster research on query-controllable video summarization and conduct our experiments, we introduce a dataset that contains frame-based relevance score labels. Our experimental results show that the text-based query helps control the video summary and improves our model's performance. Our code and dataset: https://github.com/Jhhuangkay/Query-controllable-Video-Summarization.
Citations: 31
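The controller/generator architecture is not detailed in the abstract. At its simplest, a query-dependent summary can be sketched as scoring frame embeddings against a query embedding and keeping the top-k frames in temporal order; the embeddings and the value of k here are assumptions, not the paper's design.

```python
import numpy as np

def summarize(frame_embs, query_emb, k=3):
    """Score each frame by similarity to the query embedding and return the
    indices of the k best-scoring frames, sorted back into temporal order."""
    scores = frame_embs @ query_emb
    top = np.argsort(scores)[-k:]  # indices of the k highest scores
    return sorted(top.tolist())
```

A different query embedding therefore yields a different summary from the same video, which is the property the paper's evaluation targets.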
Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-03-23 DOI: 10.1145/3372278.3390670
Eric Müller-Budack, Jonas Theiner, Sebastian Diering, Maximilian Idahl, R. Ewerth
Abstract: The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
Citations: 27
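One plausible reading of such a cross-modal similarity measure (the max-over-regions aggregation below is an assumption, not necessarily the paper's choice): embed each linked entity from example images gathered on the Web, embed regions of the news photo, and score each entity by its best-matching region.

```python
import numpy as np

def entity_consistency(entity_embs, region_embs):
    """For each text entity, take the best cosine match among photo regions,
    then average over entities to get one consistency score for the article."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    per_entity = [max(cos(e, r) for r in region_embs) for e in entity_embs]
    return sum(per_entity) / len(per_entity)
```

A score near 1 suggests the photo depicts the entities the text names; a low score flags a potential image/text mismatch for a human assessor to inspect.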
PredNet and Predictive Coding: A Critical Review
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-02-11 DOI: 10.1145/3372278.3390694
Roshan Prakash Rane, Edit Szugyi, V. Saxena, André Ofner, S. Stober
Abstract: PredNet, a deep predictive coding network developed by Lotter et al., combines a biologically inspired architecture based on the propagation of prediction error with self-supervised representation learning in video. While the architecture has drawn a lot of attention and various extensions of the model exist, a critical analysis has been lacking. We fill this gap by evaluating PredNet both as an implementation of predictive coding theory and as a self-supervised video prediction model, using a challenging video action classification dataset. We design an extended model to test whether conditioning future frame predictions on the action class of the video improves model performance. We show that PredNet does not yet completely follow the principles of predictive coding. The proposed top-down conditioning leads to a performance gain on synthetic data but does not scale up to the more complex real-world action classification dataset. Our analysis is aimed at guiding future research on similar architectures based on predictive coding theory.
Citations: 10
iCap: Interactive Image Captioning with Predictive Text
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-01-31 DOI: 10.1145/3372278.3390697
Zhengxiong Jia, Xirong Li
Abstract: In this paper we study the new topic of interactive image captioning with a human in the loop. Different from automated image captioning, where a given test image is the sole input at inference time, in the interactive scenario we have access to both the test image and a sequence of (incomplete) user-input sentences. We formulate the problem as Visually Conditioned Sentence Completion (VCSC). For VCSC, we propose ABD-Cap, asynchronous bidirectional decoding for image caption completion. With ABD-Cap as the core module, we build iCap, a web-based interactive image captioning system capable of predicting new text with respect to live input from a user. A number of experiments covering both automated evaluations and real user studies show the viability of our proposals.
Citations: 8
Explaining with Counter Visual Attributes and Examples
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-01-27 DOI: 10.1145/3372278.3390672
Sadaf Gulshad, A. Smeulders
Abstract: In this paper, we aim to explain the decisions of neural networks by utilizing multimodal information: counter-intuitive attributes and counter visual examples, which appear when perturbed samples are introduced. Different from previous work on interpreting decisions using saliency maps, text, or visual patches, we propose to use attributes and counter-attributes, and examples and counter-examples, as part of the visual explanations. When humans explain visual decisions, they tend to do so by providing attributes and examples. Hence, inspired by this human style of explanation, we provide attribute-based and example-based explanations. Moreover, humans also tend to explain their visual decisions by adding counter-attributes and counter-examples to describe what is not seen. We introduce directed perturbations in the examples to observe which attribute values change when classifying the examples into the counter classes. This delivers intuitive counter-attributes and counter-examples. Our experiments with both coarse- and fine-grained datasets show that attributes provide discriminating and human-understandable intuitive and counter-intuitive explanations.
Citations: 13
Automatic Reminiscence Therapy for Dementia
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2019-10-25 DOI: 10.1145/3372278.3391927
Mariona Carós, M. Garolera, P. Radeva, Xavier Giró-i-Nieto
Abstract: With people living longer than ever, the number of cases of dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that by 2050 more than 100 million will be affected. While there are no effective treatments for these terminal diseases, therapies such as reminiscence, which stimulate memories from the past, are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automate reminiscence therapy: a dialogue system that uses photos of the users as input to generate questions about their life. Overall, this paper presents how reminiscence therapy can be automated using deep learning and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia.
Citations: 24
Compact Network Training for Person ReID
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2019-10-15 DOI: 10.1145/3372278.3390686
Hussam Lawen, Avi Ben-Cohen, M. Protter, Itamar Friedman, Lihi Zelnik-Manor
Abstract: The task of person re-identification (ReID) has attracted growing attention in recent years, leading to improved performance, albeit with little focus on real-world applications. Most state-of-the-art methods are based on heavy pre-trained models, e.g. ResNet50 (~25M parameters), which makes them less practical and more tedious for exploring architecture modifications. In this study, we focus on a small, randomly initialized model that enables us to easily introduce architecture and training modifications suitable for person ReID. The outcomes of our study are a compact network and a fitting training regime. We show the robustness of the network by outperforming the state of the art on both Market1501 and DukeMTMC. Furthermore, we show the representation power of our ReID network via state-of-the-art results on the different task of multi-object tracking.
Citations: 10
Proceedings of the 2020 International Conference on Multimedia Retrieval
DOI: 10.5555/3403712
Citations: 1