Proceedings of the 1st ACM International Conference on Multimedia Retrieval: Latest Publications

High-level event detection system based on discriminant visual concepts
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992064
I. Tsampoulatidis, Nikolaos Gkalelis, A. Dimou, V. Mezaris, Y. Kompatsiaris
Abstract: This paper demonstrates a new approach to detecting high-level events that may be depicted in images or video frames. Given a non-annotated content item, a large number of previously trained visual concept detectors are applied to it, and their responses are used to represent the content item as a model vector in a high-dimensional concept space. Subsequently, an improved subclass discriminant analysis method identifies a concept subspace, within the aforementioned concept space, that is most appropriate for detecting and recognizing the target high-level events. In this subspace, the nearest-neighbor rule is used to compare the non-annotated content item with a few known example instances of the target events. The high-level events used as targets in the present version of the system are those defined for the TRECVID 2010 Multimedia Event Detection (MED) task.
Citations: 8
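The pipeline described in the abstract (concept-detector responses stacked into a model vector, projection into a discriminant subspace, a nearest-neighbor decision against known exemplars) can be sketched roughly as follows. The projection matrix `W`, the exemplars, and the event labels are toy placeholders: in the actual system `W` would be learned offline with the paper's improved subclass discriminant analysis.

```python
import numpy as np

def detect_event(query_scores, W, exemplars, exemplar_labels):
    """Classify a content item by the nearest-neighbor rule in a
    discriminant concept subspace.

    query_scores    : (C,) responses of C visual-concept detectors
    W               : (C, d) projection into the discriminant subspace
    exemplars       : (N, C) model vectors of known event instances
    exemplar_labels : (N,) event label of each exemplar
    """
    q = query_scores @ W                      # project query into subspace
    E = exemplars @ W                         # project exemplars
    dists = np.linalg.norm(E - q, axis=1)     # Euclidean distances
    return exemplar_labels[np.argmin(dists)]  # nearest-neighbor decision

# Toy example: 4 concepts, a 1-D subspace separating two events.
W = np.array([[1.0], [1.0], [-1.0], [-1.0]])
exemplars = np.array([[0.9, 0.8, 0.1, 0.2],   # "parade"-like model vector
                      [0.1, 0.2, 0.9, 0.8]])  # "repair"-like model vector
labels = np.array(["parade", "repair"])
print(detect_event(np.array([0.8, 0.9, 0.2, 0.1]), W, exemplars, labels))  # → parade
```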
City exploration by use of spatio-temporal analysis and clustering of user contributed photos
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992061
S. Papadopoulos, Christos Zigkolis, S. Kapiris, Y. Kompatsiaris, A. Vakali
Abstract: We present a technical demonstration of an online city exploration application that helps users identify interesting spots in a city through spatio-temporal analysis and clustering of user-contributed photos. Our framework analyzes the spatial distribution of large, city-centered collections of user-contributed photos at different time scales in order to index the most popular spots of a city in a time-aware manner. Subsequently, the photo sets belonging to the same spatio-temporal context are clustered in order to extract representative photos for each spot. The resulting application enables users to obtain flexible summaries of the most important spots in a city given a temporal slice (time of day, month, season). The demonstration will be based on a photo dataset covering major European cities.
Citations: 6
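A minimal sketch of the spatial-clustering step: photo locations that chain together within a distance threshold are grouped into one spot. The single-link threshold clustering below is an illustrative stand-in (the abstract does not name a specific algorithm), and the coordinates are toy values.

```python
import numpy as np
from collections import deque

def cluster_spots(coords, eps):
    """Group photo locations into spots: two photos belong to the same
    spot if they are connected by a chain of pairwise distances < eps.
    coords : (N, 2) array of (lat, lon); eps in the same units.
    Returns an (N,) array of cluster labels."""
    n = len(coords)
    labels = -np.ones(n, dtype=int)           # -1 means "unassigned"
    cur = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = cur
        queue = deque([i])
        while queue:                          # flood-fill the eps-neighborhood
            j = queue.popleft()
            d = np.linalg.norm(coords - coords[j], axis=1)
            for k in np.flatnonzero((d < eps) & (labels == -1)):
                labels[k] = cur
                queue.append(k)
        cur += 1
    return labels

# Two photo groups in different areas of a city (toy coordinates, degrees).
pts = np.array([[48.858, 2.294], [48.859, 2.295],
                [48.853, 2.350], [48.854, 2.349]])
print(cluster_spots(pts, eps=0.005))  # → [0 0 1 1]
```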
Consumer video understanding: a benchmark database and an evaluation of human and machine performance
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992025
Yu-Gang Jiang, Guangnan Ye, Shih-Fu Chang, D. Ellis, A. Loui
Abstract: Recognizing visual content in unconstrained videos has become a very important problem for many applications. Existing corpora for video analysis lack scale and/or content diversity, and have thus limited the needed progress in this critical area. In this paper, we describe and release a new database called CCV, containing 9,317 web videos over 20 semantic categories, including events like "baseball" and "parade", scenes like "beach", and objects like "cat". The database was collected with extra care to ensure relevance to consumer interest and originality of video content without post-editing. Such videos typically have very little textual annotation and can thus benefit from the development of automatic content analysis techniques. We used the Amazon MTurk platform to perform manual annotation, and studied the behaviors and performance of human annotators on MTurk. We also compared the abilities of humans and machines in understanding consumer video content. For the latter, we implemented automatic classifiers using a state-of-the-art multi-modal approach that achieved top performance in the recent TRECVID multimedia event detection task. Results confirm that classifiers fusing audio and video features significantly outperform single-modality solutions. We also found that humans are much better at understanding categories of non-rigid objects such as "cat", while current automatic techniques are relatively close to humans in recognizing categories that have distinctive background scenes or audio patterns.
Citations: 291
Learning reconfigurable hashing for diverse semantics
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992003
Yadong Mu, Xiangyu Chen, Tat-Seng Chua, Shuicheng Yan
Abstract: In recent years, locality-sensitive hashing (LSH) has gained plenty of attention from both the multimedia and computer vision communities due to its empirical success and theoretical guarantees in large-scale visual indexing and retrieval. Conventional LSH algorithms are designed either for generic metrics such as Cosine similarity, the ℓ2-norm and the Jaccard index, or for metrics learned from user-supplied supervision information. The common drawbacks of existing algorithms are their inability to adapt to metric changes, along with their inefficacy when handling diverse semantics (e.g., more than 1K different categories in the well-known ImageNet database). For the metrics underlying the hashing structure, even tiny changes tend to nullify previous indexing efforts, which motivates our proposed framework of "reconfigurable hashing". The basic idea is to maintain a large pool of over-complete hashing functions embedded in the ambient feature space, which serves as the common infrastructure for high-level diverse semantics. At runtime, the algorithm dynamically selects relevant hashing bits by maximizing consistency with the specific semantics-induced metric, thereby making the pre-computed hashing bits reusable. Such a reusable scheme especially benefits the indexing and retrieval of large-scale datasets, since it facilitates one-off indexing rather than continuous, computation-intensive maintenance for metric adaptation. We propose a sequential bit-selection algorithm based on local consistency and global regularization. Extensive studies are conducted on large-scale image benchmarks to comparatively investigate the performance of different strategies for reconfigurable hashing. Despite the vast literature on hashing, to the best of our knowledge few endeavors have been devoted to the reusability of hashing structures on large-scale datasets.
Citations: 13
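The bit-selection idea can be illustrated with a greedy score: keep the pool bits that agree on pairs labeled similar and disagree on pairs labeled dissimilar. This is a simplified stand-in for the paper's sequential bit selection with local consistency and global regularization; the random-hyperplane pool and the labeled pairs below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_pool(dim, pool_size):
    """Over-complete pool of random-hyperplane (sign) hash functions."""
    return rng.standard_normal((pool_size, dim))

def select_bits(H, X, sim_pairs, dis_pairs, k):
    """Pick the k pool bits most consistent with a task-specific metric
    given similar / dissimilar pairs (greedy sketch, not the paper's
    exact regularized algorithm)."""
    bits = (X @ H.T) > 0                       # (N, pool) binary codes
    scores = np.zeros(len(H))
    for a, b in sim_pairs:                     # reward bits that agree
        scores += (bits[a] == bits[b])
    for a, b in dis_pairs:                     # reward bits that separate
        scores += (bits[a] != bits[b])
    return np.argsort(scores)[::-1][:k]        # indices of the best bits

# Toy data: points 0 and 1 are semantically similar, 0 and 2 are not.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
H = build_pool(dim=2, pool_size=16)
chosen = select_bits(H, X, sim_pairs=[(0, 1)], dis_pairs=[(0, 2)], k=4)
print(len(chosen))   # 4 reusable bits picked from the pool of 16
```

Because the pool is computed once and only the selection step depends on the target metric, changing semantics means re-selecting bits rather than re-indexing the whole collection, which is the reusability the abstract emphasizes.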
Person-specific age estimation under ranking framework
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992034
Yong Ma, T. Xiong, Y. Zou, Kongqiao Wang
Abstract: Unlike traditional age estimation methods built on classification or regression frameworks, this paper proposes a novel person-specific age estimation method under a ranking framework. The basic idea is to treat the aging process as a personal, age-ranked image sequence and to extract the relevant information from this sequence. The age of an unknown face image is estimated by first using face recognition to find the persons in the template sets who look similar to the unseen person, then estimating the ranking order of the unseen person within the corresponding person-specific image sequences, and lastly mapping and fusing the rank order to a real age. Under this framework, the proposed system not only naturally estimates the correct age order of pairs of faces but also estimates the real age accurately. The proposed method shows encouraging performance in comparative experiments both as an age ranker and as an accurate age estimator, and the experiments also support the validity of the above assumption.
Citations: 18
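The rank-then-map step can be sketched with a scalar "aging score" standing in for the learned pairwise ranker: the query is placed within a similar person's age-ordered sequence, and its rank position is mapped to a real age by interpolation. All names and values below are illustrative, not the paper's implementation.

```python
import numpy as np

def estimate_age(query_score, template_scores, template_ages):
    """Place the query in a similar person's age-ordered sequence by a
    scalar aging score (a stand-in for the learned ranker), then map the
    resulting rank position to a real age by linear interpolation."""
    order = np.argsort(template_scores)              # age-rank the sequence
    s = np.asarray(template_scores, float)[order]
    a = np.asarray(template_ages, float)[order]
    return float(np.interp(query_score, s, a))       # rank → real age

# Template person photographed at ages 10..40, with a rising aging score.
ages = [10, 20, 30, 40]
scores = [0.1, 0.35, 0.6, 0.9]
print(estimate_age(0.475, scores, ages))   # falls midway between 20 and 30
```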
A color-action perceptual approach to the classification of animated movies
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992006
B. Ionescu, C. Vertan, P. Lambert, A. Benoît
Abstract: We address a particular case of video genre classification, namely the classification of animated movies. This task is achieved using two categories of content descriptors, temporal and color-based, which are adapted to this particular content. Temporal descriptors, like rhythm or action, quantify the perception of the action content at different levels. Color descriptors are derived from color perception, quantified in terms of color-distribution statistics, elementary hues, color properties (e.g., the amount of light colors, cold colors, etc.) and color relationships. The potential of the proposed descriptors for the classification task is demonstrated through experimental tests conducted on more than 159 hours of video footage. Despite the high diversity of the video material, the proposed descriptors achieve average precision and recall of up to 90% and 92%, respectively, and a global correct-detection ratio of up to 92%.
Citations: 8
Image modality classification: a late fusion method based on confidence indicator and closeness matrix
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992051
Xingzhi Sun, L. Gong, A. Natsev, Xiaofei Teng, Li-Ying Tian, Tao Wang, Yue Pan
Abstract: Automatic recognition or classification of medical image modality can provide valuable information for medical image retrieval and analysis. In this paper, we discuss an application of SVM ensemble classifiers to the problem, and explore a late fusion method based on a confidence indicator to resolve ambiguity across competing classes. Using a closeness matrix and a set of additional fusion rules, the proposed method improves classification performance by subjecting only likely misclassified samples to a text-based classifier, followed by additional fusion of the image-based and text-based classification results. An empirical evaluation on the standard ImageClef2010 Medical Retrieval data shows very promising performance for the proposed approach.
Citations: 3
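The confidence-gated fusion can be sketched as follows: when the image classifier's top-two margin is small, the sample is treated as likely misclassified and its scores are fused with the text-based classifier's scores. The margin-based confidence indicator, the threshold `tau`, and the weight `w` are illustrative assumptions, not the paper's exact closeness-matrix rules.

```python
import numpy as np

def fuse_modality(image_scores, text_scores, tau=0.2, w=0.5):
    """Confidence-gated late fusion sketch: trust the image-based
    classifier when its top-two margin is clear, otherwise fall back to
    a weighted fusion with the text-based classifier."""
    top2 = np.sort(image_scores)[-2:]
    confidence = top2[1] - top2[0]            # margin between best classes
    if confidence >= tau:
        final = image_scores                  # confident: image-only
    else:                                     # ambiguous: fuse modalities
        final = w * image_scores + (1 - w) * text_scores
    return int(np.argmax(final))

# Ambiguous image scores (toy classes: X-ray, CT, MRI); text disambiguates.
img = np.array([0.40, 0.38, 0.22])
txt = np.array([0.10, 0.80, 0.10])
print(fuse_modality(img, txt))   # → 1
```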
Embracing semantics in zoomable user interface
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992058
Daniele Panza, A. Vitali, Alexandro Sentinelli, Luca Celetto
Abstract: In the Information Retrieval (IR) field, the majority of research to date approaches all questions in terms of semantic tagging, indexing and feature extraction. Although this is a fundamental step in designing any IR system, we believe that an efficient human-machine interface (HMI) can also significantly improve the retrieval success rate on the end-user side. In particular, Zoomable User Interfaces (ZUIs) are becoming popular in many applications and devices, thanks to their advantages in terms of usability and appeal to the public. In this paper we propose what we call a "Semantic ZUI": a traditional ZUI enriched with semantic engines that work behind the scenes, with features designed to provide a seamless filesystem-browsing experience. Since users are becoming both consumers and producers of huge amounts of untagged content, we explore the potential of ZUIs by introducing features that focus on multimedia content. We have set up the first version of a fully working demo environment with the aim of stimulating debate in the IR community from the end-user-experience point of view.
Citations: 0
Diversity ranking for video retrieval from a broadcaster archive
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992052
Xavier Giró-i-Nieto, Monica Alfaro, F. Marqués
Abstract: Video retrieval through text queries is a very common practice in broadcaster archives. The query keywords are compared to the metadata labels that documentalists have previously associated with the video assets. This paper focuses on a ranking strategy that places more relevant keyframes among the top hits of the ranked result lists while, at the same time, preserving a diversity of video assets. Previous solutions based on a random walk over a visual-similarity graph have been modified to increase asset diversity by filtering the edges between keyframes depending on their asset. The random-walk algorithm is applied separately for every visual feature to avoid any normalization issues between visual-similarity metrics. Finally, this work evaluates performance with two separate metrics: relevance is measured by Average Precision, and diversity is assessed by Average Diversity, a new metric presented in this work.
Citations: 6
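The asset-filtered random walk can be sketched as a PageRank-style power iteration in which edges between keyframes of the same asset are removed before row-normalizing the similarity matrix, so relevance spreads only across assets. The similarity values and damping factor below are toy assumptions.

```python
import numpy as np

def diversity_rank(S, assets, d=0.85, iters=50):
    """Rank keyframes by a random walk over a visual-similarity matrix S,
    with intra-asset edges filtered out to promote asset diversity.
    assets[i] is the id of the video asset keyframe i belongs to."""
    A = np.array(assets)
    W = S * (A[:, None] != A[None, :])        # drop intra-asset edges
    row = W.sum(axis=1, keepdims=True)
    P = np.divide(W, row,                     # row-stochastic transitions;
                  out=np.full_like(W, 1.0 / len(W)),
                  where=row > 0)              # dangling rows → uniform
    r = np.full(len(W), 1.0 / len(W))
    for _ in range(iters):                    # damped power iteration
        r = (1 - d) / len(W) + d * (r @ P)
    return np.argsort(r)[::-1]                # keyframe indices, best first

S = np.array([[0.0, 0.9, 0.2, 0.1],
              [0.9, 0.0, 0.3, 0.2],
              [0.2, 0.3, 0.0, 0.8],
              [0.1, 0.2, 0.8, 0.0]])
assets = [0, 0, 1, 1]   # keyframes 0-1 from one asset, 2-3 from another
print(diversity_rank(S, assets))
```

Note that the strongest similarities here (0.9 between keyframes 0 and 1, 0.8 between 2 and 3) are intra-asset and are therefore ignored by the walk, which is exactly the filtering the abstract describes.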
Accurate content-based video copy detection with efficient feature indexing
Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Pub Date: 2011-04-18. DOI: 10.1145/1991996.1992015
Yusuke Uchida, M. Agrawal, S. Sakazawa
Abstract: We describe an accurate content-based copy detection system that uses both local and global visual features to ensure robustness. Our system advances state-of-the-art techniques in four key directions. (1) Multiple-codebook-based product quantization: conventional product quantization methods encode feature vectors using a single codebook, resulting in large quantization error. We propose a novel codebook generation method for an arbitrary number of codebooks. (2) Handling of temporal burstiness: for a stationary scene, once a query feature matches incorrectly, the match continues in successive frames, resulting in a high false-alarm rate. We present a temporal-burstiness-aware scoring method that reduces the impact of similar features, thereby reducing false alarms. (3) Densely sampled SIFT descriptors: conventional global features suffer from a lack of distinctiveness and invariance to non-photometric transformations. Our densely sampled global SIFT features are more discriminative and robust against logo or pattern insertions. (4) Bigram- and multiple-assignment-based indexing for global features: we extract two SIFT descriptors from each location, which makes them more distinctive. To improve recall, we propose multiple assignments on both the query and reference sides. A performance evaluation on the TRECVID 2009 dataset indicates that both the local and global approaches outperform conventional schemes. Furthermore, the integration of these two approaches achieves a three-fold reduction in the error rate compared with the best performance reported in the TRECVID 2009 workshop.
Citations: 13
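For context, the single-codebook product quantization that point (1) improves on can be sketched as follows: the feature vector is split into sub-vectors, and each sub-vector is replaced by the index of its nearest centroid in that sub-space's codebook. The codebooks below are toy values; the paper's contribution is a generation method for an arbitrary number of codebooks to cut the quantization error of this baseline.

```python
import numpy as np

def pq_encode(x, codebooks):
    """Baseline product-quantization encoding (one codebook per sub-space):
    split x into len(codebooks) sub-vectors and store, for each sub-vector,
    the index of its nearest centroid."""
    subs = np.split(np.asarray(x, float), len(codebooks))
    return [int(np.argmin(np.linalg.norm(cb - s, axis=1)))
            for cb, s in zip(codebooks, subs)]

# Two sub-spaces of 2 dims each, 2 centroids per codebook (toy values).
codebooks = [np.array([[0.0, 0.0], [1.0, 1.0]]),
             np.array([[0.0, 1.0], [1.0, 0.0]])]
print(pq_encode([0.9, 1.1, 0.1, 0.9], codebooks))  # → [1, 0]
```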