Proceedings of the 1st ACM International Conference on Multimedia Retrieval: Latest Publications

Attribute-based vehicle search in crowded surveillance videos
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992014
R. Feris, Behjat Siddiquie, Y. Zhai, James Petterson, L. Brown, Sharath Pankanti
Abstract: We present a novel application for searching for vehicles in surveillance videos based on semantic attributes. At the interface, the user specifies a set of vehicle characteristics (such as color, direction of travel, speed, length, height, etc.) and the system automatically retrieves video events that match the provided description. A key differentiating aspect of our system is the ability to handle challenging urban conditions such as high volumes of activity and environmental factors. This is achieved through a novel multi-view vehicle detection approach which relies on what we call motionlet classifiers, i.e. classifiers that are learned with vehicle samples clustered in the motion configuration space. We employ massively parallel feature selection to learn compact and accurate motionlet detectors. Moreover, in order to deal with different vehicle types (buses, trucks, SUVs, cars), we learn the motionlet detectors in a shape-free appearance space, where all training samples are resized to the same aspect ratio, and then during test time the aspect ratio of the sliding window is changed to allow the detection of different vehicle types. Once a vehicle is detected and tracked over the video, fine-grained attributes are extracted and ingested into a database to allow future search queries such as "Show me all blue trucks larger than 7ft length traveling at high speed northbound last Saturday, from 2pm to 5pm".
Citations: 45
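The attribute queries described in this abstract can be sketched as a simple filter over ingested event records. The schema and field names below are hypothetical, since the paper does not specify its database layout; this is only an illustration of the kind of query the system answers:

```python
from dataclasses import dataclass

@dataclass
class VehicleEvent:
    # Hypothetical schema for the fine-grained attributes that the
    # paper says are ingested into a database after detection/tracking.
    color: str
    vtype: str          # e.g. "car", "truck", "bus", "SUV"
    length_ft: float
    speed_mph: float
    direction: str      # e.g. "northbound"

def search(events, **criteria):
    """Return events matching every attribute criterion.

    A criterion value may be a plain value (tested for equality) or a
    predicate function (range tests such as length > 7 ft)."""
    def matches(ev):
        for field, want in criteria.items():
            have = getattr(ev, field)
            ok = want(have) if callable(want) else have == want
            if not ok:
                return False
        return True
    return [ev for ev in events if matches(ev)]

events = [
    VehicleEvent("blue", "truck", 9.0, 55.0, "northbound"),
    VehicleEvent("blue", "truck", 6.0, 60.0, "northbound"),
    VehicleEvent("red",  "car",   5.0, 70.0, "southbound"),
]

# "Show me all blue trucks longer than 7 ft traveling at high speed
# northbound" (the time-window clause is omitted for brevity).
hits = search(events, color="blue", vtype="truck",
              length_ft=lambda v: v > 7,
              speed_mph=lambda v: v > 50,
              direction="northbound")
print(len(hits))  # 1
```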
Adaptive clustering and interactive visualizations to support the selection of video clips
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992030
Andreas Girgensohn, F. Shipman, L. Wilcox
Abstract: Although people are capturing more video with their mobile phones, digital cameras, and other devices, they rarely watch all that video. More commonly, users extract a still image from the video to print or a short clip to share with others. We created a novel interface for browsing through a video keyframe hierarchy to find frames or clips. The interface is shown to be more efficient than scrolling linearly through all keyframes. We developed algorithms for selecting quality keyframes and for clustering keyframes hierarchically. At each level of the hierarchy, a single representative keyframe from each cluster is shown. Users can drill down into the most promising cluster and view representative keyframes for the sub-clusters. Our clustering algorithms optimize for short navigation paths to the desired keyframe. A single keyframe is located using a non-temporal clustering algorithm. A video clip is located using one of two temporal clustering algorithms. We evaluated the clustering algorithms using a simulated search task. User feedback provided us with valuable suggestions for improvements to our system.
Citations: 18
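The drill-down navigation this abstract describes can be illustrated with a toy sketch: keyframes are reduced to 1-D feature values, clusters are contiguous groups, and each cluster's representative is its medoid. This illustrates the browsing idea and the short-navigation-path property only; it is not the paper's actual clustering algorithms:

```python
def split(frames, k):
    """Partition sorted 1-D frame features into k contiguous groups."""
    frames = sorted(frames)
    size = max(1, len(frames) // k)
    return [frames[i:i + size] for i in range(0, len(frames), size)]

def representative(cluster):
    """Medoid: the member closest to the cluster mean."""
    mean = sum(cluster) / len(cluster)
    return min(cluster, key=lambda f: abs(f - mean))

def browse(frames, target, k=3):
    """Count drill-down steps to reach the cluster holding the target.

    At each level the user sees one representative per cluster and
    descends into the cluster containing the desired keyframe."""
    steps = 0
    while len(frames) > k:
        clusters = split(frames, k)
        reps = [representative(c) for c in clusters]  # what the user sees
        steps += 1
        frames = next(c for c in clusters if target in c)
    return steps

# 27 keyframe features with branching factor 3: two drill-downs reach
# a cluster small enough to display in full.
frames = list(range(27))
print(browse(frames, target=13))  # 2
```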
Locally regressive G-optimal design for image retrieval
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992055
Zhengjun Zha, Yantao Zheng, Meng Wang, Fei Chang, Tat-Seng Chua
Abstract: Content Based Image Retrieval (CBIR) has attracted increasing attention from both academia and industry. Relevance feedback is one of the most effective techniques to bridge the semantic gap in CBIR. One of the key research problems related to relevance feedback is how to select the most informative images for users to label. In this paper, we propose a novel active learning algorithm, called Locally Regressive G-Optimal Design (LRGOD), for relevance feedback image retrieval. Our assumption is that for each image, its label can be well estimated based on its neighbors via a locally regressive function. The LRGOD algorithm is developed based on a locally regressive least squares model which makes use of the labeled and unlabeled images, and simultaneously exploits the local structure of each image. The images that can minimize the maximum prediction variance are selected as the most informative ones. We evaluated the proposed LRGOD approach on two real-world image corpora, the Corel and NUS-WIDE-OBJECT [5] datasets, and compared it to three state-of-the-art active learning methods. The experimental results demonstrate the effectiveness of the proposed approach.
Citations: 2
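The G-optimality criterion this abstract uses (select the point whose labeling minimizes the maximum prediction variance) can be shown on a deliberately simple 1-D ridge-regression model, where the prediction variance at x is proportional to x² / (Σxᵢ² + λ). This toy model stands in for the paper's locally regressive least-squares model, which is not reproduced here:

```python
def max_pred_variance(labeled_x, pool_x, lam=1.0):
    """Maximum prediction variance over the pool for 1-D ridge
    regression: var(x) is proportional to x^2 / (sum of labeled
    x_i^2 + lam)."""
    s = sum(x * x for x in labeled_x) + lam
    return max(x * x / s for x in pool_x)

def g_optimal_pick(labeled_x, pool_x):
    """Pick the pool point whose labeling minimizes the maximum
    prediction variance over the remaining pool (G-optimality)."""
    def score(i):
        rest = [x for j, x in enumerate(pool_x) if j != i] or [pool_x[i]]
        return max_pred_variance(labeled_x + [pool_x[i]], rest)
    best = min(range(len(pool_x)), key=score)
    return pool_x[best]

labeled = [1.0, 2.0]
pool = [0.5, 3.0, 10.0]
# The extreme point (10.0) shrinks the variance everywhere the most,
# so the G-optimal criterion selects it for labeling first.
print(g_optimal_pick(labeled, pool))  # 10.0
```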
A flexible environment for multimedia management and publishing
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992072
M. Bertini, A. Bimbo, G. Ioannidis, Alexandru Stan, Emile Bijk
Abstract: In this paper, we describe the IM3I system, which provides a flexible approach to managing and publishing collections of images and videos. The system is based on web services that allow automatic and manual annotation, retrieval, browsing, and authoring of multimedia. Results of user evaluations, performed by professional archivists and archive managers on a real-world system deployment, have confirmed that the system is easy to use and delivers a complete set of functionalities.
Citations: 1
An eye-tracking-based approach to facilitate interactive video search
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992039
S. Vrochidis, I. Patras, Y. Kompatsiaris
Abstract: This paper investigates the role of gaze movements as implicit user feedback during interactive video retrieval tasks. In this context, we use a content-based video search engine to perform an interactive video retrieval experiment, during which we record the user gaze movements with the aid of an eye-tracking device and generate features for each video shot based on aggregated past user eye fixation and pupil dilation data. Then, we employ support vector machines to train a classifier that can identify shots marked as relevant to a new query topic submitted by new users. The positive results provided by the classifier are used as recommendations for future users who search for similar topics. The evaluation shows that important information can be extracted from aggregated gaze movements during video retrieval tasks, while the involvement of pupil dilation data improves the performance of the system and facilitates interactive video search.
Citations: 14
A kernel density based approach for large scale image retrieval
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992024
Wei Tong, Fengjie Li, Tianbao Yang, Rong Jin, Anil K. Jain
Abstract: Local image features, such as SIFT descriptors, have been shown to be effective for content-based image retrieval (CBIR). In order to achieve efficient image retrieval using local features, most existing approaches represent an image by a bag-of-words model in which every local feature is quantized into a visual word. Given the bag-of-words representation for images, a text search engine is then used to efficiently find the matched images for a given query. The main drawback with these approaches is that the two key steps, i.e., key point quantization and image matching, are separated, leading to sub-optimal performance in image retrieval. In this work, we present a statistical framework for large-scale image retrieval that unifies key point quantization and image matching by introducing a kernel density function. The key ideas of the proposed framework are (a) each image is represented by a kernel density function from which the observed key points are sampled, and (b) the similarity of a gallery image to a query image is estimated as the likelihood of generating the key points in the query image by the kernel density function of the gallery image. We present efficient algorithms for kernel density estimation as well as for effective image matching. Experiments with large-scale image retrieval confirm that the proposed method is not only more effective but also more efficient than the state-of-the-art approaches in identifying visually similar images for given queries from large image databases.
Citations: 7
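The matching criterion (b) in this abstract, scoring a gallery image by the likelihood of the query's keypoints under the gallery's kernel density, can be sketched with toy 2-D descriptors and a Gaussian kernel. The paper's efficient estimation algorithms are not reproduced; this only shows the likelihood-based scoring idea:

```python
import math

def kde_log_likelihood(query_pts, gallery_pts, h=1.0):
    """Score a gallery image for a query: log-likelihood of the query's
    keypoint descriptors under a Gaussian kernel density estimated from
    the gallery's descriptors (toy 2-D descriptors)."""
    def kernel(p, q):
        d2 = sum((a - b) ** 2 for a, b in zip(p, q))
        return math.exp(-d2 / (2 * h * h))
    ll = 0.0
    for p in query_pts:
        density = sum(kernel(p, q) for q in gallery_pts) / len(gallery_pts)
        ll += math.log(density + 1e-12)  # guard against log(0)
    return ll

query   = [(0.0, 0.0), (1.0, 1.0)]
similar = [(0.1, 0.0), (0.9, 1.1), (0.5, 0.5)]
distant = [(5.0, 5.0), (6.0, 7.0), (8.0, 8.0)]

# The gallery whose density better explains the query's keypoints
# ranks higher, which is exactly the matching criterion above.
assert kde_log_likelihood(query, similar) > kde_log_likelihood(query, distant)
```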
Active learning through notes data in Flickr: an effortless training data acquisition approach for object localization
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992042
Lei Zhang, Jun Ma, C. Cui, Piji Li
Abstract: Most of the state-of-the-art systems for object localization rely on supervised machine learning techniques, and are thus limited by the lack of labeled training data. In this paper, our motivation is to provide training data for object localization effectively and efficiently. We argue that the notes data in Flickr can be exploited as a novel source for object modeling. First, we apply a text mining method to gather semantically related images for a specific class. Then a handful of images are selected manually as seed images, i.e. an initial training set. Finally, the training set is expanded by an incremental active learning framework. Our approach requires significantly less manual supervision compared to standard methods. The experimental results on the PASCAL VOC 2007 and NUS-WIDE datasets show that the training data acquired by our approach can complement or even substitute for conventional training data for object localization.
Citations: 7
Scene-based image retrieval by transitive matching
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992043
A. Ulges, Christian Schulze
Abstract: We address scene-based image retrieval, the challenge of finding pictures taken at the same location as a given query image, where a key challenge lies in the fact that target images may show the same scene but different parts of it. To overcome this lack of direct correspondences with the query image, we study two strategies that exploit the structure of the targeted image collection: first, cluster matching, where pictures are grouped and retrieval is conducted on cluster level; second, a probabilistically motivated shortest path approach that determines retrieval scores based on the shortest path in a cost graph defined over the image collection. We evaluate both approaches on several datasets including indoor and outdoor locations, demonstrating that the accuracy of scene-based retrieval can be improved substantially (by up to 40%), particularly by the shortest path approach.
Citations: 3
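The shortest-path strategy in this abstract can be sketched with Dijkstra's algorithm on a small cost graph. Assuming, as one plausible reading of the probabilistic motivation (the abstract does not give the formulation), that edge costs are negative log match probabilities, the retrieval score becomes the probability along the best transitive path:

```python
import heapq
import math

def shortest_path_score(edges, query, target):
    """Transitive matching sketch: edges hold pairwise match
    probabilities; cost = -log(p), so path costs add and the retrieval
    score is the product of probabilities along the cheapest path."""
    graph = {}
    for a, b, p in edges:
        graph.setdefault(a, []).append((b, -math.log(p)))
        graph.setdefault(b, []).append((a, -math.log(p)))
    dist = {query: 0.0}
    heap = [(0.0, query)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return math.exp(-d)          # back to probability space
        if d > dist.get(node, float("inf")):
            continue                     # stale queue entry
        for nxt, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return 0.0                           # target unreachable

# Query Q barely matches target T directly (p = 0.1), but matches it
# well transitively through intermediate image M (0.9 * 0.8 = 0.72).
edges = [("Q", "T", 0.1), ("Q", "M", 0.9), ("M", "T", 0.8)]
print(round(shortest_path_score(edges, "Q", "T"), 2))  # 0.72
```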
Consistent visual words mining with adaptive sampling
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992045
Pierre Letessier, Olivier Buisson, A. Joly
Abstract: State-of-the-art large-scale object retrieval systems usually combine efficient bag-of-words indexing models with a spatial verification re-ranking stage to improve query performance. In this paper we propose to directly discover spatially verified visual words as a batch process. Contrary to previous related methods based on feature set hashing or clustering, we suggest not trading recall for efficiency by sticking to an accurate two-stage matching strategy. The problem then becomes a sampling issue: how to effectively and efficiently select relevant query regions while minimizing the number of tentative probes? We therefore introduce an adaptive weighted sampling scheme, starting with some prior distribution and iteratively converging to unvisited regions. Interestingly, the proposed paradigm generalizes to any input prior distribution, including specific visual concept detectors or efficient hashing-based methods. We show in the experiments that the proposed method discovers highly interpretable visual words while providing excellent recall and image representativity.
Citations: 12
Lookapp: interactive construction of web-based concept detectors
Pub Date: 2011-04-18 · DOI: 10.1145/1991996.1992062
Damian Borth, A. Ulges, T. Breuel
Abstract: While online platforms like YouTube and Flickr provide massive content for training visual concept detectors, it remains a difficult challenge to retrieve the right training content from such platforms. In this technical demonstration we present lookapp, a system for the interactive construction of web-based concept detectors. Its major features are an interactive "concept-to-query" mapping for training data acquisition and an efficient detector construction based on third-party cloud computing services.
Citations: 5