Proceedings of the 1st ACM International Conference on Multimedia Retrieval: Latest Publications

Synthetically trained multi-view object class and viewpoint detection for advanced image retrieval
Johannes Schels, Joerg Liebelt, K. Schertler, R. Lienhart
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1991999
Abstract: This paper proposes a novel approach to multi-view object class and viewpoint detection for retrieving images that show one or several objects from a given viewpoint, a viewpoint range, or any viewpoint. All detectors are trained exclusively on a few synthetic 3D models, without any manual bounding-box, viewpoint, or part annotation, making object class and viewpoint detection a scalable learning task. Previous work on this topic relies on detecting object parts for each individual viewpoint, ignoring the responses of part detectors specific to other viewpoints. Instead, we explicitly exploit the appearance ambiguities caused by spurious detections of parts under more than one viewpoint by combining all detector responses in a joint spatial pyramid encoding. We achieve state-of-the-art results in multi-view object class detection and viewpoint determination on current benchmark data sets and demonstrate increased robustness to partial occlusion.
Citations: 15

Instant video summarization during shooting with mobile phone
Xiao Zeng, Xiaohui Xie, Kongqiao Wang
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992036
Abstract: To facilitate the review and management of home videos captured on mobile phones, we propose a novel instant summarization method that runs while the user is shooting. Segment boundaries and key frames are extracted without delay, so the extracted frames stay strictly synchronized with the scene being captured. Partial context is the main challenge of this method, since only the frames captured so far are available when summarization is applied; the limited computational resources of mobile phones are a further constraint, especially when video compression runs at the same time. Several frame features are used for segmentation and key-frame extraction, and an original key-frame updating strategy is presented to optimize the selected representative frames under partial context. Experimental results demonstrate that the proposed method achieves low computational complexity, high effectiveness, and a good user experience.
Citations: 9

NV-Tree: nearest neighbors at the billion scale
Herwig Lejsek, B. Jónsson, L. Amsaleg
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992050
Abstract: This paper presents the NV-Tree (Nearest Vector Tree). It addresses the specific yet important problem of efficiently and effectively finding the approximate k-nearest neighbors within a collection of a few billion high-dimensional data points. The NV-Tree is a very compact index: only six bytes are kept in the index for each high-dimensional descriptor, so it scales extremely well when indexing large collections of high-dimensional descriptors. The NV-Tree efficiently produces results of good quality even at a scale where the indices can no longer be kept entirely in main memory. We demonstrate this with extensive experiments on a collection of 2.5 billion SIFT (Scale Invariant Feature Transform) descriptors.
Citations: 47

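For context on the problem the NV-Tree addresses, a minimal sketch of the exact brute-force baseline it approximates may help; this is not the NV-Tree algorithm itself (which is not described in the abstract beyond its six-byte-per-descriptor index), just the linear scan whose cost at billions of points motivates compact approximate indexes.

```python
import heapq

def knn_bruteforce(query, descriptors, k=3):
    """Exact k-nearest neighbors by linear scan (squared Euclidean).

    Linear in the collection size, so infeasible at billions of points;
    approximate indexes such as the NV-Tree trade exactness for scale.
    Returns the indices of the k closest descriptors, nearest first.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return heapq.nsmallest(k, range(len(descriptors)),
                           key=lambda i: sqdist(query, descriptors[i]))
```
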
Exploiting contextual spaces for image re-ranking and rank aggregation
D. C. G. Pedronette, R. Torres
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992009
Abstract: The objective of content-based image retrieval (CBIR) systems is to return the images most similar to a query image, so accurately ranking collection images is of great relevance. In general, CBIR systems consider only pairwise image analysis, that is, they compute similarity measures between pairs of images and ignore the rich information encoded in the relations among several images. This paper presents a novel re-ranking approach based on contextual spaces that improves the effectiveness of CBIR tasks by exploring relations among images. In our approach, the information encoded both in distances among images and in the ranked lists computed by CBIR systems is used to analyze contextual information. The re-ranking method can also be applied to other tasks, such as (i) combining ranked lists obtained with different image descriptors (rank aggregation) and (ii) combining post-processing methods. We conducted several experiments involving shape, color, and texture descriptors, with comparisons to other post-processing methods. Experimental results demonstrate the effectiveness of our method.
Citations: 30

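To make the rank-aggregation task concrete, here is a minimal sketch of Borda-count aggregation, one of the simplest classical baselines for combining ranked lists from different descriptors; it is not the contextual-space method of the paper, only an illustration of what "rank aggregation" takes as input and produces as output.

```python
from collections import defaultdict

def borda_aggregate(ranked_lists):
    """Borda-count rank aggregation over several ranked lists.

    Each list awards an item points by position (top of an n-item list
    gets n points, the next n-1, and so on); items are then re-ranked
    by their total score across all lists.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - pos  # top position gets the most points
    return sorted(scores, key=scores.get, reverse=True)
```

For example, aggregating the rankings from a shape descriptor and a color descriptor promotes images that both descriptors place near the top, even if neither ranks them first.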
RetrievalLab: a programming tool for content based retrieval
Ard A. J. Oerlemans, M. Lew
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992067
Abstract: In this paper we present RetrievalLab, a content-based retrieval tool designed for both educational and research purposes. It facilitates testing new features, segmentations, machine learning approaches, and evaluation methods by presenting a Matlab-like programming interface that illuminates the fundamental processes and algorithms in content-based retrieval.
Citations: 0

Component-based track inspection using machine-vision technology
Y. Li, Charles Otto, N. Haas, Yuichi Fujiki, Sharath Pankanti
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992056
Abstract: In this paper, we present our latest research engagement with a railroad company to apply machine-vision technologies to automate the inspection and condition monitoring of railroad tracks. Specifically, we propose a complete architecture comprising an imaging setup that captures multiple video streams; detection of important rail components such as tie plates, spikes, anchors, and joint-bar bolts; defect identification such as raised spikes; defect severity and temporal condition analysis; and long-term predictive assessment. This paper focuses on the video analytics we developed to detect rail components, which form the building blocks of the entire framework. Our preliminary performance study achieved an average 98.2% detection rate, 1.57% false positive rate, and 1.78% false negative rate for component detection. Finally, lacking sufficient representative data and annotations to evaluate system performance on exception detection at the sequence and compliance levels, we propose a mathematical modeling approach to calculate the probabilities of detecting such exceptions. This analysis shows that there is still considerable room to improve our approaches in order to achieve the desired false positive and miss rates at the sequence level.
Citations: 46

Indexing the signature quadratic form distance for efficient content-based multimedia retrieval
C. Beecks, Jakub Lokoč, T. Seidl, T. Skopal
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992020
Abstract: The Signature Quadratic Form Distance has been introduced as an adaptive similarity measure that copes with the flexible content representations of various multimedia data. Although the Signature Quadratic Form Distance has shown good retrieval performance in terms of both effectiveness and efficiency, its applicability to index structures remains a challenging issue due to its dynamic nature. In this paper, we investigate the indexability of the Signature Quadratic Form Distance with respect to metric access methods. We show how the distance's inherent parameters determine indexability, and we analyze the relationship between effectiveness and efficiency on numerous image databases.
Citations: 37

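For readers unfamiliar with the measure, a minimal sketch of the Signature Quadratic Form Distance between two feature signatures follows. A signature here is a set of (centroid, weight) pairs, and the Gaussian similarity kernel with parameter `alpha` is one common choice of the "inherent parameters" the paper refers to; both the kernel and `alpha` are assumptions of this sketch, not fixed by the abstract.

```python
import math

def sqfd(sig1, sig2, alpha=1.0):
    """Signature Quadratic Form Distance between two feature signatures.

    Each signature is a list of (centroid, weight) pairs, centroids being
    coordinate tuples. Concatenates the weights of sig1 with the negated
    weights of sig2 and evaluates sqrt(w . A . w^T), where A holds pairwise
    centroid similarities; here A uses the Gaussian kernel
    f(ci, cj) = exp(-alpha * d(ci, cj)^2).
    """
    cents = [c for c, _ in sig1] + [c for c, _ in sig2]
    weights = [w for _, w in sig1] + [-w for _, w in sig2]

    def sim(a, b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-alpha * d2)

    total = sum(weights[i] * weights[j] * sim(cents[i], cents[j])
                for i in range(len(cents)) for j in range(len(cents)))
    return math.sqrt(max(total, 0.0))  # clamp tiny negative rounding error
```

The kernel parameter changes the geometry of the distance, which is why, as the paper notes, the distance's parameters determine how well metric access methods can index it.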
Fusing heterogeneous modalities for video and image re-ranking
Hung-Khoon Tan, C. Ngo
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992011
Abstract: Multimedia documents on popular image and video sharing websites such as Flickr and YouTube are heterogeneous documents with diverse representations and rich user-supplied information. In this paper, we investigate how the agreement among heterogeneous modalities can be exploited to guide data fusion. The fusion problem is cast as the simultaneous mining of agreement from different modalities and adaptation of fusion weights to construct a fused graph from these modalities. We propose an iterative framework based on agreement-fusion optimization and plug in two well-known algorithms, random walk and semi-supervised learning, to illustrate how agreement (or conflict) is incorporated (or compromised) under uniform and adaptive fusion. Experimental results on web video and image re-ranking demonstrate that, with a proper fusion strategy rather than simple linear fusion, an improvement in search performance can generally be expected.
Citations: 22

Spatial codebooks for image categorization
Eugene Mbanya, S. Gerke, P. Ndjiki-Nya
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992046
Abstract: Bag-of-words approaches to image categorization are currently very popular due to their relative simplicity, robustness, and high efficiency, but they cannot represent the spatial composition of an image. Several approaches address this drawback, with spatial pyramids being the most popular. Spatial pyramids divide an image into smaller blocks and compute a feature vector for each block; these vectors are concatenated to form the feature vector of the whole image. This increases the dimension of the image's feature vector by a factor equal to the number of blocks, and computation time grows proportionally. We propose an extension of the image feature vector by spatial features, which yields a descriptor of similar size to the standard bag-of-words approach, while classification performance is similar to that of spatial pyramids, whose significantly larger feature vectors are more computationally expensive.
Citations: 9

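The dimensionality growth that motivates this paper can be sketched with a short helper. It assumes the standard spatial-pyramid grid scheme in which level l splits the image into a 2^l x 2^l grid (the usual convention, not spelled out in the abstract), with one vocabulary-sized histogram per block, all concatenated.

```python
def spatial_pyramid_dim(vocab_size, levels):
    """Feature dimension of a spatial pyramid histogram.

    Level l divides the image into a 2**l x 2**l grid of blocks, each
    contributing a vocab_size-dimensional histogram; concatenating them
    multiplies the descriptor size by the total number of blocks. The
    spatial-codebook approach instead keeps the descriptor near the
    plain bag-of-words size of vocab_size.
    """
    blocks = sum((2 ** l) ** 2 for l in range(levels + 1))
    return vocab_size * blocks
```

With a 1000-word vocabulary, a two-level pyramid (1 + 4 + 16 blocks) already produces a 21000-dimensional vector, versus 1000 for plain bag-of-words.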
Lost in binarization: query-adaptive ranking for similar image search with compact codes
Yu-Gang Jiang, Jun Wang, Shih-Fu Chang
Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992012
Abstract: With the proliferation of images on the Web, fast search for visually similar images has attracted significant attention. State-of-the-art techniques often embed high-dimensional visual features into a low-dimensional Hamming space, where search can be performed in real time based on the Hamming distance between compact binary codes. Unlike traditional metrics (e.g., Euclidean) on raw image features, which produce continuous distances, Hamming distances are discrete integer values; in practice, a large number of images often share the same Hamming distance to a query, a critical issue for image search, where ranking matters. In this paper, we propose a novel approach that enables query-adaptive ranking of images with equal Hamming distance. We first learn, offline, bit weights of the binary codes for a diverse set of predefined semantic concept classes. The weight-learning process is formulated as a quadratic programming problem that minimizes intra-class distance while preserving inter-class relationships in the original raw feature space. Query-adaptive weights are then rapidly computed by evaluating the proximity between the query and the concept categories. With these adaptive bit weights, the returned images can be ordered by weighted Hamming distance at a finer-grained binary-code level rather than at the integer Hamming distance level. Experimental results on a Flickr image dataset show clear improvements from our query-adaptive ranking approach.
Citations: 47

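The tie-breaking idea at the core of this paper can be illustrated in a few lines: two codes at the same integer Hamming distance from a query become distinguishable once each bit carries a weight. The weights below are illustrative values, not the paper's learned, query-adaptive ones (which come from the quadratic program described in the abstract).

```python
def hamming(c1, c2):
    """Integer Hamming distance: the number of differing bit positions."""
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))

def weighted_hamming(c1, c2, weights):
    """Weighted Hamming distance: each differing bit contributes its weight,
    yielding a real-valued distance that breaks ties among codes sharing
    the same integer Hamming distance."""
    return sum(w for b1, b2, w in zip(c1, c2, weights) if b1 != b2)
```

For a query code 0000, the codes 1000 and 0001 both sit at integer distance 1; with bit weights reflecting bit importance for the query's likely concept, weighted Hamming distance orders them.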