Proceedings of the 2020 International Conference on Multimedia Retrieval — Latest Publications

Continuous Health Interface Event Retrieval
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-16 DOI: 10.1145/3372278.3390705
Vaibhav Pandey, Nitish Nag, Ramesh C. Jain
Abstract: Knowing the state of our health at every moment in time is critical for advances in health science. Using data obtained outside an episodic clinical setting is the first step towards building a continuous health estimation system. In this paper, we explore a system that allows users to combine events and data streams from different sources and retrieve complex biological events, such as cardiovascular volume overload, using measured lifestyle events. These complex events, which have been explored in the biomedical literature and which we call interface events, have a direct causal impact on the relevant biological systems; they are the interface through which lifestyle events influence our health. We retrieve the interface events from existing events and data streams by encoding domain knowledge using the event operator language. The interface events can then be utilized to provide a continuous estimate of the biological variables relevant to the user's health state. The event-based framework also makes it easier to estimate which event is causally responsible for a particular change in an individual's health state.
Citations: 6
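The abstract names an "event operator language" for encoding domain knowledge but does not define it. A purely illustrative sketch follows: the `Event` type, the `within` operator, and the meal/fluid rule are all invented for illustration, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    start: float  # hours since midnight
    end: float

def within(events, name, window):
    """Operator: select events of a given type that overlap a time window."""
    lo, hi = window
    return [e for e in events if e.name == name and e.start < hi and e.end > lo]

def derive_interface_event(events):
    """Toy rule (hypothetical): a high-sodium meal with no fluid intake in the
    following four hours is flagged as a volume-overload-risk interface event."""
    for meal in within(events, "high_sodium_meal", (0, 24)):
        if not within(events, "fluid_intake", (meal.end, meal.end + 4)):
            yield Event("volume_overload_risk", meal.start, meal.end + 4)
```

Composing such operators over measured streams is what would let an event-based framework trace a health-state change back to the lifestyle event that caused it.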
Search Result Clustering in Collaborative Sound Collections
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-08 DOI: 10.1145/3372278.3390691
Xavier Favory, F. Font, Xavier Serra
Abstract: The large size of today's online multimedia databases makes retrieving their content a difficult and time-consuming task. Users of online sound collections typically submit search queries that express a broad intent, often making the system return large and unmanageable result sets. Search result clustering is a technique that organises search-result content into coherent groups, allowing users to identify useful subsets in their results. Obtaining coherent and distinctive clusters that can be explored with a suitable interface is crucial for making this technique a useful complement to traditional search engines. In our work, we propose a graph-based approach using audio features for clustering diverse sound collections obtained when querying large online databases. We propose an approach to assess the performance of different features at scale by taking advantage of the metadata associated with each sound. This analysis is complemented with an evaluation using ground-truth labels from manually annotated datasets. We show that using a confidence measure for discarding inconsistent clusters improves the quality of the partitions. After identifying the most appropriate features for clustering, we conduct an experiment with users performing a sound design task in order to evaluate our approach and its user interface. A qualitative analysis is carried out, including usability questionnaires and semi-structured interviews. This provides valuable new insights regarding the features that promote efficient interaction with the clusters.
Citations: 5
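The abstract reports that discarding inconsistent clusters via a confidence measure improves partition quality, without detailing the measure. A minimal sketch, assuming unit-normalizable feature vectors and a silhouette-like intra-minus-inter cosine score (both the score and the threshold are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def cluster_confidence(features, labels):
    """Score each cluster by mean intra-cluster cosine similarity minus
    mean similarity to points outside the cluster."""
    # normalize rows so dot products are cosine similarities
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = feats @ feats.T
    conf = {}
    for c in np.unique(labels):
        inside = labels == c
        intra = sims[np.ix_(inside, inside)].mean()
        inter = sims[np.ix_(inside, ~inside)].mean() if (~inside).any() else 0.0
        conf[c] = intra - inter
    return conf

def filter_clusters(features, labels, threshold=0.1):
    """Keep only clusters whose confidence exceeds the threshold."""
    conf = cluster_confidence(features, labels)
    return {c for c, v in conf.items() if v >= threshold}
```

A scattered cluster whose members resemble the rest of the collection as much as each other gets a confidence near zero and is discarded.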
Query-controllable Video Summarization
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-04-07 DOI: 10.1145/3372278.3390695
Jia-Hong Huang, M. Worring
Abstract: When video collections become huge, exploring both within and across videos efficiently is challenging. Video summarization is one way to tackle this issue. Traditional summarization approaches limit the effectiveness of video exploration because they generate only one fixed video summary for a given input video, independent of the user's information need. In this work, we introduce a method that takes a text-based query as input and generates a video summary corresponding to it. We do so by modeling video summarization as a supervised learning problem and propose an end-to-end deep-learning-based method for query-controllable video summarization that generates a query-dependent video summary. Our proposed method consists of a video summary controller, a video summary generator, and a video summary output module. To foster research on query-controllable video summarization and conduct our experiments, we introduce a dataset that contains frame-based relevance score labels. Our experimental results show that the text-based query helps control the video summary and improves our model's performance. Our code and dataset: https://github.com/Jhhuangkay/Query-controllable-Video-Summarization.
Citations: 31
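The controller/generator architecture is not detailed in the abstract. At its simplest, a query-dependent summary can be sketched as scoring frame embeddings against a query embedding and keeping the top-k frames in temporal order; the embeddings and the value of k here are assumptions, not the paper's design.

```python
import numpy as np

def summarize(frame_embs, query_emb, k=3):
    """Score each frame by similarity to the query embedding and return the
    indices of the k best-scoring frames, sorted back into temporal order."""
    scores = frame_embs @ query_emb
    top = np.argsort(scores)[-k:]  # indices of the k highest scores
    return sorted(top.tolist())
```

A different query embedding therefore yields a different summary from the same video, which is the property the paper's evaluation targets.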
Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-03-23 DOI: 10.1145/3372278.3390670
Eric Müller-Budack, Jonas Theiner, Sebastian Diering, Maximilian Idahl, R. Ewerth
Abstract: The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. The photos can be decorative, depict additional details, or even contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today's society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of these entities with the news photo, using state-of-the-art computer vision approaches. In contrast to previous work, our system automatically gathers example data from the Web and is applicable to real-world news. The feasibility is demonstrated on two novel datasets that cover different languages, topics, and domains.
Citations: 27
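One plausible reading of such a cross-modal similarity measure (the max-over-regions aggregation below is an assumption, not necessarily the paper's choice): embed each linked entity from example images gathered on the Web, embed regions of the news photo, and score each entity by its best-matching region.

```python
import numpy as np

def entity_consistency(entity_embs, region_embs):
    """For each text entity, take the best cosine match among photo regions,
    then average over entities to get one consistency score for the article."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    per_entity = [max(cos(e, r) for r in region_embs) for e in entity_embs]
    return sum(per_entity) / len(per_entity)
```

A score near 1 suggests the photo depicts the entities the text names; a low score flags a potential image/text mismatch for a human assessor to inspect.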
PredNet and Predictive Coding: A Critical Review
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-02-11 DOI: 10.1145/3372278.3390694
Roshan Prakash Rane, Edit Szugyi, V. Saxena, André Ofner, S. Stober
Abstract: PredNet, a deep predictive coding network developed by Lotter et al., combines a biologically inspired architecture based on the propagation of prediction error with self-supervised representation learning in video. While the architecture has drawn a lot of attention and various extensions of the model exist, a critical analysis has been lacking. We fill this gap by evaluating PredNet both as an implementation of predictive coding theory and as a self-supervised video prediction model, using a challenging video action classification dataset. We design an extended model to test whether conditioning future frame predictions on the action class of the video improves model performance. We show that PredNet does not yet completely follow the principles of predictive coding. The proposed top-down conditioning leads to a performance gain on synthetic data but does not scale up to the more complex real-world action classification dataset. Our analysis is aimed at guiding future research on similar architectures based on predictive coding theory.
Citations: 10
iCap: Interactive Image Captioning with Predictive Text
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-01-31 DOI: 10.1145/3372278.3390697
Zhengxiong Jia, Xirong Li
Abstract: In this paper we study the new topic of interactive image captioning with a human in the loop. Different from automated image captioning, where a given test image is the sole input at inference time, in the interactive scenario we have access to both the test image and a sequence of (incomplete) user-input sentences. We formulate the problem as Visually Conditioned Sentence Completion (VCSC). For VCSC, we propose ABD-Cap, asynchronous bidirectional decoding for image caption completion. With ABD-Cap as the core module, we build iCap, a web-based interactive image captioning system capable of predicting new text with respect to live input from a user. A number of experiments covering both automated evaluations and real user studies show the viability of our proposals.
Citations: 8
Explaining with Counter Visual Attributes and Examples
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2020-01-27 DOI: 10.1145/3372278.3390672
Sadaf Gulshad, A. Smeulders
Abstract: In this paper, we aim to explain the decisions of neural networks by utilizing multimodal information: counter-intuitive attributes and counter visual examples, which appear when perturbed samples are introduced. Different from previous work on interpreting decisions using saliency maps, text, or visual patches, we propose to use attributes and counter-attributes, and examples and counter-examples, as part of the visual explanations. When humans explain visual decisions, they tend to do so by providing attributes and examples. Hence, inspired by this human style of explanation, we provide attribute-based and example-based explanations. Moreover, humans also tend to explain their visual decisions by adding counter-attributes and counter-examples to describe what is not seen. We introduce directed perturbations in the examples to observe which attribute values change when classifying the examples into the counter classes. This delivers intuitive counter-attributes and counter-examples. Our experiments with both coarse- and fine-grained datasets show that attributes provide discriminating and human-understandable intuitive and counter-intuitive explanations.
Citations: 13
Automatic Reminiscence Therapy for Dementia
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2019-10-25 DOI: 10.1145/3372278.3391927
Mariona Carós, M. Garolera, P. Radeva, Xavier Giró-i-Nieto
Abstract: With people living longer than ever, the number of cases of dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that by 2050 more than 100 million will be affected. While there are no effective treatments for these terminal diseases, therapies such as reminiscence, which stimulate memories from the past, are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automate reminiscence therapy: a dialogue system that uses photos of the users as input to generate questions about their life. Overall, this paper presents how reminiscence therapy can be automated using deep learning and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia.
Citations: 24
Compact Network Training for Person ReID
Proceedings of the 2020 International Conference on Multimedia Retrieval Pub Date : 2019-10-15 DOI: 10.1145/3372278.3390686
Hussam Lawen, Avi Ben-Cohen, M. Protter, Itamar Friedman, Lihi Zelnik-Manor
Abstract: The task of person re-identification (ReID) has attracted growing attention in recent years, leading to improved performance, albeit with little focus on real-world applications. Most state-of-the-art methods are based on heavy pre-trained models, e.g. ResNet50 (~25M parameters), which makes them less practical and more tedious for exploring architecture modifications. In this study, we focus on a small, randomly initialized model that enables us to easily introduce architecture and training modifications suitable for person ReID. The outcomes of our study are a compact network and a fitting training regime. We show the robustness of the network by outperforming the state of the art on both Market1501 and DukeMTMC. Furthermore, we show the representation power of our ReID network via state-of-the-art results on the different task of multi-object tracking.
Citations: 10
Proceedings of the 2020 International Conference on Multimedia Retrieval
DOI: 10.5555/3403712
Citations: 1