2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI): Latest Publications

Uploader models for video concept detection
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-07-10. DOI: 10.1109/CBMI.2014.6849847
B. Mérialdo, U. Niaz
Abstract: In video indexing, it has been observed that a simple uploader model can improve the MAP of concept detection in the TRECVID Semantic Indexing (SIN) task. In this paper, we explore this idea further by comparing different types of uploader models and different types of score/rank distributions. We evaluate the performance of these combinations on the best SIN 2012 runs and explore the impact of their parameters. We observe that the improvement is generally smaller for the best runs than for the weaker ones. We also observe that tuning the models for each concept independently produces a much more significant improvement.
Citations: 2
Annotation of still images by multiple visual concepts
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849844
Abdelkader Hamadi, P. Mulhem, G. Quénot
Abstract: The automatic indexing of images and videos is a highly relevant and important research area in the field of multimedia information retrieval, and its difficulty is well established. Most of the research community's efforts have so far focused on detecting single concepts in images and videos, which is already a hard task. As information retrieval systems evolve, users' needs become more abstract and lead to queries composed of a larger number of words. It therefore makes sense to index multimedia documents by more than one concept, to help retrieval systems answer such complex queries. Few studies have specifically addressed the problem of detecting multiple concepts (multi-concepts) in images and videos, and most of them concern the detection of concept pairs. These studies showed that this challenge is even greater than single-concept detection. In this work, we address multi-concept detection in still images. Two types of approaches are considered: 1) building a model per multi-concept, and 2) fusing single-concept detectors. We conducted our evaluation on the PASCAL VOC'12 collection for the detection of concept pairs and triplets. Our results show that the two types of approaches give globally comparable results, but they differ on specific kinds of pairs/triplets.
Citations: 1
Searching images with MPEG-7 (& MPEG-7-like) Powered Localized dEscriptors: The SIMPLE answer to effective Content Based Image Retrieval
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849821
C. Iakovidou, N. Anagnostopoulos, Athanasios Ch. Kapoutsis, Y. Boutalis, S. Chatzichristofis
Abstract: In this paper we propose and evaluate a new technique that localizes the description ability of the well-established MPEG-7 and MPEG-7-like global descriptors. We employ the SURF detector to locate salient image patches of blob-like texture, and use the MPEG-7 Scalable Color (SC), Color Layout (CL) and Edge Histogram (EH) descriptors, as well as the global MPEG-7-like Color and Edge Directivity Descriptor (CEDD), to produce the final local feature vectors. To test the new descriptors in the most straightforward fashion, we use the Bag-Of-Visual-Words framework for indexing and retrieval. Experimental results on two benchmark databases with varying codebook sizes reveal a remarkable boost in the retrieval performance of the proposed descriptors, compared both to their original global form and to other state-of-the-art local and global descriptors. Open-source implementations of the proposed descriptors are available in C#, Java and MATLAB.
Citations: 32
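The entry above indexes its localized descriptors with the standard Bag-Of-Visual-Words framework. As a minimal sketch of that indexing step (the codebook and descriptor values below are toy numbers for illustration, not taken from the paper), local feature vectors are quantized to their nearest codebook centroid and accumulated into a normalized histogram:

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors (m x d) against a codebook (k x d) by
    nearest centroid and return a normalized k-bin histogram: the
    Bag-Of-Visual-Words signature used for indexing and retrieval."""
    # Squared Euclidean distance from every descriptor to every centroid
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                     # nearest visual word per patch
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy example: three 2-D descriptors against a 2-word codebook
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.0], [1.0, 0.9]])
hist = bovw_histogram(descriptors, codebook)  # -> [1/3, 2/3]
```

In practice the codebook is learned by clustering (e.g. k-means) over descriptors from a training set, and the histograms are what get compared at retrieval time.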
Scalable video summarization of cultural video documents in cross-media space based on data cube approach
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849824
Karina Ruby Perez-Daniel, M. Nakano-Miyatake, J. Benois-Pineau, S. Maabout, G. Sargent
Abstract: Video summarization is a core problem in managing the growing amount of content in multimedia databases. An efficient video summary should give an overview of the video content, and most existing approaches fulfil this goal; however, they do not let the user access details of interest selectively and progressively. This paper proposes a scalable video summarization approach that provides multiple views and levels of detail. Our method relies on a cross-media space and a consensus clustering method. A video document is modelled as a data cube in which the level of detail is refined over the non-consensual features of the space. The method is designed for weakly structured content such as cultural documentaries and was tested on the INA corpus of cultural archives.
Citations: 8
A robust audio fingerprinting method for content-based copy detection
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849814
Chahid Ouali, P. Dumouchel, Vishwa Gupta
Abstract: This paper presents a novel audio fingerprinting method that is highly robust to a variety of audio distortions, based on an unconventional fingerprint generation scheme. Robustness is achieved by generating different versions of the spectrogram matrix of the audio signal, using thresholds based on the average of the spectral values to prune the matrix. Each version of the pruned spectrogram matrix is transformed into a 2-D binary image. The multiple 2-D images suppress noise to varying degrees, which improves the likelihood that one of them matches a reference image. To speed up matching, we convert each image into an n-dimensional vector and perform a nearest neighbor search on these vectors. We test this method on the TRECVID 2010 content-based copy detection evaluation dataset. Experimental results show the effectiveness of these fingerprints even when the audio is distorted. Compared to a state-of-the-art audio copy detection system, our method improves localization accuracy by 22% and halves the minimal normalized detection cost rate (min NDCR) for audio transformations T1 and T2.
Citations: 24
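The pruning idea in the abstract above can be sketched as thresholding a spectrogram at multiples of its mean value to obtain several binary images. The threshold factors and the toy spectrogram below are illustrative assumptions; the paper does not state its exact values here:

```python
import numpy as np

def binary_fingerprints(spectrogram, factors=(0.5, 1.0, 2.0)):
    """Prune a magnitude spectrogram into several 2-D binary images, one
    per threshold factor applied to the mean spectral value.  The
    different thresholds suppress noise to varying degrees, so at least
    one image is likely to match a clean reference."""
    mean_val = spectrogram.mean()
    return [(spectrogram >= f * mean_val).astype(np.uint8) for f in factors]

# Toy spectrogram: 4 frequency bins x 6 time frames
spec = np.array([[0.1, 0.2, 0.9, 0.8, 0.1, 0.0],
                 [0.0, 0.7, 0.9, 0.6, 0.2, 0.1],
                 [0.3, 0.4, 0.5, 0.4, 0.3, 0.2],
                 [0.0, 0.1, 0.2, 0.1, 0.0, 0.0]])
images = binary_fingerprints(spec)
# Each binary image is flattened into an n-dimensional vector
# for nearest-neighbour search against a reference database.
vectors = [img.ravel() for img in images]
```

Lower factors keep more spectrogram cells set to 1; higher factors keep only the strongest peaks, which is what makes one of the variants robust under a given distortion.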
Online multimodal matrix factorization for human action video indexing
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849823
F. Páez, Jorge A. Vanegas, F. González
Abstract: This paper addresses the problem of searching for videos containing instances of specific human actions. The proposed strategy builds a multimodal latent space representation into which both visual content and annotations are simultaneously mapped. The hypothesis behind the method is that such a latent space yields better results when built from multiple data modalities. The semantic embedding is learned by matrix factorization through stochastic gradient descent, which makes it suitable for large-scale collections. The method is evaluated on a large-scale human action video dataset with three modalities: action labels, action attributes and visual features. The evaluation follows a query-by-example strategy in which a sample video is used as input to the system, and a retrieved video is considered relevant if it contains an instance of the same human action as the query. Experimental results show that the learned multimodal latent semantic representation outperforms an exclusively visual representation.
Citations: 1
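The embedding described above can be sketched as a plain stochastic-gradient matrix factorization over a matrix whose columns concatenate the modalities. The squared loss, learning rate, and the simple concatenation of visual features with label indicators below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def sgd_mf(X, k=2, lr=0.05, epochs=3000, seed=0):
    """Factorize X ~= U @ V under squared loss by stochastic gradient
    descent, updating one randomly chosen sample per step, so the latent
    codes in U can be refined online as new videos arrive."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, k))   # per-video latent codes
    V = 0.1 * rng.standard_normal((k, d))   # shared multimodal basis
    for _ in range(epochs):
        i = rng.integers(n)
        err = X[i] - U[i] @ V               # residual for this sample
        U_i = U[i].copy()
        U[i] += lr * err @ V.T              # gradient step on the latent code
        V += lr * np.outer(U_i, err)        # gradient step on the basis
    return U, V

# Toy multimodal matrix: 2 visual features concatenated with 2 label indicators
X = np.array([[1.0, 0.9, 1.0, 0.0],
              [1.1, 1.0, 1.0, 0.0],
              [0.1, 0.2, 0.0, 1.0],
              [0.0, 0.1, 0.0, 1.0]])
U, V = sgd_mf(X)
# Rows of U are latent video representations; videos of the same action
# should end up close together in the latent space.
```

Query-by-example retrieval then amounts to nearest-neighbour search among the rows of U.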
Ultrasound image processing based on machine learning for the fully automatic evaluation of the Carotid Intima-Media Thickness
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849839
R. Menchón-Lara, J. Sancho-Gómez
Abstract: Atherosclerosis is responsible for a large proportion of cardiovascular diseases (CVD), the leading cause of death in the world. The atherosclerotic process, which mainly affects medium- and large-size arteries, is a degenerative condition that causes thickening and reduced elasticity of the blood vessels. The Intima-Media Thickness (IMT) of the Common Carotid Artery (CCA) is a reliable early indicator of atherosclerosis. It is usually measured manually by marking pairs of points on a B-mode ultrasound image of the CCA. This paper proposes an automatic image segmentation procedure for measuring the IMT, avoiding user dependence and inter-rater variability. In particular, Radial Basis Function (RBF) networks are designed and trained with the Optimally Pruned Extreme Learning Machine (OP-ELM) algorithm to classify the pixels of a given ultrasound image, allowing the extraction of the IMT boundaries. The suggested approach has been validated on a set of 25 ultrasound images by comparing the automatic segmentations with manual tracings.
Citations: 8
Inverse square rank fusion for multimodal search
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849825
André Mourão, Flávio Martins, João Magalhães
Abstract: Rank fusion is the task of combining multiple ranked document lists into a single ranked list. It is a late fusion approach designed to improve the rankings produced by individual systems, and it has been applied across many domains, e.g. combining the results of multiple retrieval functions, or multimodal search where several feature spaces are common. In this paper, we present the Inverse Square Rank (ISR) fusion method family, a set of novel, fully unsupervised rank fusion methods based on quadratic decay and logarithmic document frequency normalization. Our experiments on standard Information Retrieval datasets (image and text fusion) and image datasets (image feature fusion) show that ISR outperforms existing rank fusion algorithms. The proposed technique thus performs comparably to or better than state-of-the-art approaches, while maintaining low computational complexity and avoiding the need for document scores or training data.
Citations: 8
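The abstract's "quadratic decay" and "logarithmic document frequency normalization" suggest a fusion rule along the following lines. The exact combination used in the paper may differ, so treat this as an assumption-laden sketch rather than the authors' formula:

```python
import math
from collections import defaultdict

def isr_fuse(rankings):
    """Inverse-square rank fusion sketch: each ranked list contributes
    1/rank^2 per document (quadratic decay), and the sum is weighted by
    the log of how many lists returned the document (the precise
    frequency normalization here is an assumption).  Only ranks are
    used: no document scores and no training data."""
    scores = defaultdict(float)
    freq = defaultdict(int)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / rank ** 2
            freq[doc] += 1
    fused = {d: scores[d] * math.log(1 + freq[d]) for d in scores}
    return sorted(fused, key=fused.get, reverse=True)

# Three modality rankings: two of them put "b" first
fused = isr_fuse([["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]])
# fused[0] == "b"
```

The quadratic decay concentrates weight near the top of each list, so a document ranked first twice dominates one ranked first only once.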
Bag of morphological words for content-based geographical retrieval
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849837
E. Aptoula
Abstract: In the context of geographical content-based image retrieval, this paper explores the descriptive potential of morphological texture descriptors combined with the popular bag-of-visual-words paradigm. In particular, we adapt existing global morphological texture descriptors so that they are computed within local sub-windows, and then form a vocabulary of "visual morphological words" through clustering. The resulting image features are thus visual word histograms, which we evaluate on the UC Merced Land Use-Land Cover dataset. Moreover, the local approach under study is compared against alternative global and local descriptors across a variety of settings. Despite being one of the first attempts at localized morphological content description, the retrieval scores indicate that vocabulary-based morphological content description has significant discriminatory potential.
Citations: 19
Automatic object annotation from weakly labeled data with latent structured SVM
2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI). Pub Date: 2014-06-18. DOI: 10.1109/CBMI.2014.6849838
Christian X. Ries, Fabian Richter, Stefan Romberg, R. Lienhart
Abstract: In this paper we present an approach to automatic object annotation: given a set of positive images that all contain a certain object, our goal is to automatically determine the position of that object in each image. Our approach first applies a heuristic, based on image and feature statistics, to identify initial bounding boxes from color and gradient features. The initial boxes are then refined by a latent structured SVM training procedure based on the CCCP algorithm. We show that our approach outperforms previous work on multiple datasets.
Citations: 4