2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI) — Latest Publications

Investigating segment-based query expansion for user-generated spoken content retrieval
Ahmad Khwileh, G. Jones
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500268
Abstract: The very rapid growth in user-generated social multimedia content on online platforms is creating new challenges for search technologies. A significant issue for search of this type of content is its highly variable form and quality, compounded by the standard information retrieval (IR) problem of mismatch between search queries and target items. Query expansion (QE) has been shown to be an effective technique for improving IR effectiveness across multiple search tasks. In QE, words from a number of relevant or assumed-relevant top-ranked documents from an initial search are added to the initial query to enrich it before a further search is carried out. In this work, we investigate the application of QE methods to searching social multimedia content, focusing in particular on content where the information is primarily in the audio stream. To address the challenge of content variability, we introduce three speech segment-based methods for QE, using semantic segmentation, discourse segmentation, and window-based segmentation. Our experimental investigation illustrates the superiority of these segment-based methods over a standard full-document QE method on a version of the MediaEval 2012 Search task newly extended as an ad hoc search task.
Citations: 5
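The QE procedure the abstract describes — taking frequent terms from the top-ranked documents of an initial search and appending them to the query — is classic pseudo-relevance feedback. A minimal sketch (term-frequency selection only; the paper's segment-based variants operate on speech segments rather than whole documents, and the function and parameter names here are illustrative):

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, k_docs=5, n_terms=10):
    """Pseudo-relevance feedback: enrich a query with the most frequent
    terms found in the top-ranked documents of an initial search."""
    pool = Counter()
    for doc in ranked_docs[:k_docs]:  # assume the top-k documents are relevant
        pool.update(t for t in doc.lower().split() if t not in query_terms)
    expansion = [term for term, _ in pool.most_common(n_terms)]
    return list(query_terms) + expansion
```

The expanded query is then issued in a second retrieval pass; segment-based QE would draw `ranked_docs` from retrieved speech segments instead of full transcripts.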
Indexing multimedia learning materials in ultimate course search
Sheetal Rajgure, Krithika Raghavan, Vincent Oria, Reza Curtmola, Edina Renfro-Michel, P. Gouton
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500250
Abstract: Multimedia is the main support for online learning materials, and the volume of multimedia learning materials is growing with the popularity of online programs offered by universities. Ultimate Course Search (UCS) is a tool that aims to provide efficient search of course materials. UCS integrates slides, lecture videos, and textbook content into a single platform with search capabilities. Keywords extracted from the textbook index and the PowerPoint slides form the basis of the indexing scheme: the slides are indexed on the keywords, and the videos are indexed on the slides. The correspondence between slides and video segments is established using the metadata provided by the video recording software when available, and by image processing techniques otherwise. Unlike classical document search, in which the user is looking for where the keywords occur, search over learning materials in UCS also has to find where the search words are best explained. We propose a keyword-appearance-prioritized ranking mechanism that integrates the location information of keywords within the slides into the ranking.
Citations: 1
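The ranking idea above — weighting a keyword match by where it appears on a slide — can be sketched as a simple position-weighted scorer. This is only an illustration of the concept, not the paper's actual ranking function; the slide fields and weights are assumptions:

```python
def score_slide(query_terms, slide, title_weight=2.0):
    """Toy location-aware ranking: count query-term hits on a slide,
    weighting hits in the slide title higher than hits in the body.
    The weight value is illustrative."""
    title = slide["title"].lower().split()
    body = slide["body"].lower().split()
    score = 0.0
    for term in query_terms:
        score += title_weight * title.count(term.lower())
        score += body.count(term.lower())
    return score

def rank_slides(query_terms, slides):
    # Highest-scoring slides first
    return sorted(slides, key=lambda s: score_slide(query_terms, s), reverse=True)
```

A slide whose title matches the query would then outrank a slide with only an incidental body mention, approximating "where the search words are best explained."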
Model-based video content representation
Lukas Diem, M. Zaharieva
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500254
Abstract: Recurring visual elements in videos commonly represent central content entities, such as main characters and dominant objects. The automated detection of such elements is crucial for various application fields, ranging from compact video content summarization to the retrieval of videos sharing common visual entities. Recent approaches for content-based video analysis commonly require prior knowledge about the appearance of potential objects of interest, or build upon a specific assumption, such as the presence of a particular camera view, object motion, or a reference set for estimating the appearance of an object. In this paper, we propose an unsupervised, model-based approach for the detection of recurring visual elements in a video sequence. Detected elements do not necessarily represent an object, yet they allow for visual and semantic interpretation. The experimental evaluation of detected models across different videos demonstrates the ability of the models to capture potentially high diversity in the visual appearance of the traced elements.
Citations: 0
Filterbank coefficients selection for segmentation in singer turns
Marwa Thlithi, J. Pinquier, Thomas Pellegrini, R. André-Obrecht
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500273
Abstract: Audio segmentation is often the first step of audio indexing systems; it provides segments that are assumed to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare several acoustic features for this task: filterbank coefficients (FBANK) and Mel-frequency cepstral coefficients (MFCC). FBANK features were shown to outperform MFCC on a "clean" singing corpus. We describe a coefficient selection method that allowed further improvement on this corpus: a 75.8% F-measure was obtained with FBANK features selected with this method, corresponding to a 30.6% absolute gain compared to MFCC. On another corpus, comprised of ethno-musicological recordings, both feature types showed a similar performance of about 60%. This corpus presents increased difficulty due to instruments overlapping with the singing and to lower recording quality.
Citations: 0
Large scale content-based video retrieval with LIvRE
Gabriel de Oliveira Barra, M. Lux, Xavier Giró-i-Nieto
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500266
Abstract: The fast growth of video data requires robust, efficient, and scalable systems for indexing and retrieval. These systems must be accessible from lightweight, portable, and usable interfaces that help users manage and search video content. This demo paper presents LIvRE, an extension of an existing open-source tool for image retrieval to support video indexing. LIvRE consists of three main system components (pre-processing, indexing, and retrieval), as well as a scalable and responsive HTML5 user interface accessible from a web browser. LIvRE supports image-based queries, which are efficiently matched with the extracted frames of the indexed videos.
Citations: 22
A hybrid graph-based and non-linear late fusion approach for multimedia retrieval
Ilias Gialampoukidis, A. Moumtzidou, Dimitris Liparas, S. Vrochidis, Y. Kompatsiaris
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500252
Abstract: Multimedia retrieval has become a task of high importance, due to the need for efficient and fast access to very large and heterogeneous multimedia collections. An interesting challenge within this task is the efficient combination of the different modalities of a multimedia object, and especially the fusion of textual and visual information. Unsupervised fusion of multiple modalities for retrieval has mostly been based on early, weighted-linear, graph-based, and diffusion-based techniques. In contrast, we present a strategy for fusing textual and visual modalities that combines a non-linear fusion model with a graph-based late fusion approach. The fusion strategy is based on the construction of a uniform multimodal contextual similarity matrix and the non-linear combination of relevance scores from query-based similarity vectors. The proposed late fusion approach is evaluated on the multimedia retrieval task on two multimedia collections, WIKI11 and IAPR-TC12. The experimental results indicate its superiority over the baseline method in terms of mean average precision on both datasets.
Citations: 14
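To make the late-fusion idea concrete: each modality produces a per-document relevance score for a query, and a non-linear combiner merges them into one ranking. The sketch below uses a weighted geometric mean, which rewards documents scoring well in both modalities; the paper's actual combination (contextual similarity matrix plus graph-based fusion) is more elaborate, so this only illustrates non-linear score fusion:

```python
def late_fuse(text_scores, visual_scores, w=0.5):
    """Non-linear late fusion of per-document relevance scores from two
    modalities via a weighted geometric mean (w = text weight).
    A document must score reasonably in BOTH modalities to rank high,
    unlike a weighted linear sum."""
    fused = {}
    for doc in text_scores.keys() & visual_scores.keys():
        t, v = text_scores[doc], visual_scores[doc]
        fused[doc] = (t ** w) * (v ** (1.0 - w))
    # Return documents sorted by fused relevance, best first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

With a linear sum, a document with scores (0.9, 0.1) ties one with (0.5, 0.5); the geometric mean ranks the balanced document higher, which is one motivation for non-linear combiners.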
Exploring an unsupervised, language independent, spoken document retrieval system
Alexandru Caranica, H. Cucu, Andi Buzo
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500262
Abstract: With the increasing availability of spoken documents in different languages, there is a need for systems that perform automatic, unsupervised search over audio streams containing speech, in a document retrieval scenario. We are interested in retrieving information from multilingual speech data, from spoken documents such as broadcast news, video archives, or even telephone conversations. The ultimate goal of a spoken document retrieval system is to enable vocabulary-independent search over large collections of speech content, to find written or spoken "queries" or reoccurring speech data. If the language is known, the task is relatively simple: one can use a large-vocabulary continuous speech recognition (LVCSR) tool to produce highly accurate word transcripts, which are then indexed, and query terms are retrieved from the index. However, if the language is unknown, queries are not part of the recognizer's vocabulary, so the relevant audio documents cannot be retrieved; search metrics suffer, and the documents retrieved are no longer relevant to the user. In this paper, we investigate whether input features derived from multi-language resources help unsupervised spoken term detection, independent of the language. Moreover, we explore multi-objective search, combining both language detection and LVCSR-based search with unsupervised spoken term detection (STD). To achieve this, we make use of multiple open-source tools and in-house acoustic and language models to propose a language-independent spoken document retrieval system.
Citations: 1
Indexing Ensembles of Exemplar-SVMs with rejecting taxonomies
Federico Becattini, Lorenzo Seidenari, A. Bimbo
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500241
Abstract: Ensembles of Exemplar-SVMs have been used for a wide variety of tasks, such as object detection, segmentation, label transfer, and mid-level feature learning. To make this technique effective, though, a large collection of classifiers is needed, which often makes the evaluation phase prohibitive. To overcome this issue, we exploit the joint distribution of exemplar classifier scores to build a taxonomy capable of indexing each Exemplar-SVM and enabling fast evaluation of the whole ensemble. We experiment with the Pascal 2007 benchmark on the task of object detection and on a simple segmentation task, in order to verify the robustness of our indexing data structure with respect to the standard ensemble. We also introduce a rejection strategy that discards non-relevant image patches for more efficient access to the data.
Citations: 1
Deep learning vs spectral clustering into an active clustering with pairwise constraints propagation
Nicolas Voiron, A. Benoît, P. Lambert, B. Ionescu
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500237
Abstract: In our data-driven world, categorization is of major importance in helping end-users and decision makers understand information structures. Supervised learning techniques rely on annotated samples, which are often difficult to obtain, and training often overfits. Unsupervised clustering techniques, on the other hand, study the structure of the data without any training data. Given the difficulty of the task, supervised learning often outperforms unsupervised learning. A compromise is to use partial knowledge, selected in a smart way, to boost performance while minimizing learning costs; this is called semi-supervised learning. In this setting, spectral clustering has proved to be an efficient method. Deep learning has also outperformed several state-of-the-art classification approaches, and it is interesting to test it in our context. In this paper, we first introduce deep learning into an active semi-supervised clustering process and compare it with spectral clustering. Second, we introduce constraint propagation and demonstrate how it maximizes partitioning quality while reducing annotation costs. Experimental validation is conducted on two different real datasets, and the results show the potential of the clustering methods.
Citations: 3
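The annotation-cost argument above rests on the fact that pairwise constraints compose: if samples (a, b) and (b, c) are annotated as must-link, (a, c) is inferred for free. A minimal sketch of this transitive propagation using union-find (the paper's propagation is richer, also spreading cannot-link constraints through the similarity structure; this shows only the must-link case):

```python
def propagate_must_links(n, must_links):
    """Transitively propagate must-link constraints over n samples.
    Returns a root label per sample: two samples with the same root
    are (directly or by inference) must-linked."""
    parent = list(range(n))

    def find(x):
        # Find the root of x with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in must_links:
        parent[find(a)] = find(b)  # merge the two constraint groups
    return [find(i) for i in range(n)]
```

Each annotated pair can thus constrain many other pairs at once, which is how active selection of a few informative annotations improves partitioning quality cheaply.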
Comparing and combining unimodal methods for multimodal recognition
S. Ishikawa, Jorma T. Laaksonen
Pub Date: 2016-06-01 · DOI: 10.1109/CBMI.2016.7500253
Abstract: Multimodal recognition has recently become a more attractive and common approach in multimedia information retrieval, and in many cases it gives better recognition results than unimodal methods alone. Most current multimodal recognition methods still depend on unimodal recognition results, so choosing suitable features and classification models for each unimodal recognition task is important for good overall performance. In this paper, we study several unimodal recognition methods, the features they use, and techniques for combining them, in the setting of concept detection in image-text data. For image features, we use GoogLeNet deep convolutional neural network (DCNN) activation features and semantic concept vectors. For text features, we use simple binary vectors for tags and word2vec vectors. As the concept detection model, we apply the Multimodal Deep Boltzmann Machine (DBM) model and the Support Vector Machine (SVM) with a linear homogeneous kernel map and a non-linear radial basis function (RBF) kernel. Experimental results on the MIRFLICKR-1M dataset show that the Multimodal DBM and the non-linear SVM approaches produce equally good results, within the margins of statistical variation.
Citations: 1