2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI) — Latest Publications

Investigating segment-based query expansion for user-generated spoken content retrieval
Ahmad Khwileh, G. Jones
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500268
Abstract: The very rapid growth in user-generated social multimedia content on online platforms is creating new challenges for search technologies. A significant issue for search of this type of content is its highly variable form and quality, compounded by the standard information retrieval (IR) problem of mismatch between search queries and target items. Query expansion (QE) has been shown to be an effective technique for improving IR effectiveness across multiple search tasks. In QE, words from a number of relevant or assumed-relevant top-ranked documents from an initial search are added to the initial query to enrich it before a further search is carried out. In this work, we investigate the application of QE methods to searching social multimedia content, focusing in particular on content where the information is primarily in the audio stream. To address the challenge of content variability, we introduce three speech segment-based methods for QE, using semantic segmentation, discourse segmentation, and window-based segmentation. Our experimental investigation illustrates the superiority of these segment-based methods over a standard full-document QE method on a version of the MediaEval 2012 Search task newly extended as an ad hoc search task.
Citations: 5
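The QE procedure the abstract describes — taking frequent terms from the top-ranked documents of an initial search and appending them to the query — is classic pseudo-relevance feedback. A minimal sketch (term-frequency selection only; the paper's segment-based variants operate on speech segments rather than whole documents, and the function and parameter names here are illustrative):

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, k_docs=5, n_terms=10):
    """Pseudo-relevance feedback: enrich a query with the most frequent
    terms found in the top-ranked documents of an initial search."""
    pool = Counter()
    for doc in ranked_docs[:k_docs]:  # assume the top-k documents are relevant
        pool.update(t for t in doc.lower().split() if t not in query_terms)
    expansion = [term for term, _ in pool.most_common(n_terms)]
    return list(query_terms) + expansion
```

The expanded query is then issued in a second retrieval pass; segment-based QE would draw `ranked_docs` from retrieved speech segments instead of full transcripts.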
Indexing multimedia learning materials in ultimate course search
Sheetal Rajgure, Krithika Raghavan, Vincent Oria, Reza Curtmola, Edina Renfro-Michel, P. Gouton
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500250
Abstract: Multimedia is the main support for online learning materials, and the volume of multimedia learning materials is growing with the popularity of online programs offered by universities. Ultimate Course Search (UCS) is a tool that aims to provide efficient search of course materials. UCS integrates slides, lecture videos, and textbook content into a single platform with search capabilities. Keywords extracted from the textbook index and the PowerPoint slides form the basis of the indexing scheme: the slides are indexed on the keywords, and the videos are indexed on the slides. The correspondence between slides and video segments is established using the metadata provided by the video recording software when available, and by image processing techniques otherwise. Unlike classical document search, in which the user is looking for where the keywords occur, search over learning materials in UCS also has to find where the search words are best explained. We propose a keyword-appearance-prioritized ranking mechanism that integrates the location information of keywords within the slides into the ranking.
Citations: 1
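The ranking idea above — weighting a keyword match by where it appears on a slide — can be sketched as a simple position-weighted scorer. This is only an illustration of the concept, not the paper's actual ranking function; the slide fields and weights are assumptions:

```python
def score_slide(query_terms, slide, title_weight=2.0):
    """Toy location-aware ranking: count query-term hits on a slide,
    weighting hits in the slide title higher than hits in the body.
    The weight value is illustrative."""
    title = slide["title"].lower().split()
    body = slide["body"].lower().split()
    score = 0.0
    for term in query_terms:
        score += title_weight * title.count(term.lower())
        score += body.count(term.lower())
    return score

def rank_slides(query_terms, slides):
    # Highest-scoring slides first
    return sorted(slides, key=lambda s: score_slide(query_terms, s), reverse=True)
```

A slide whose title matches the query would then outrank a slide with only an incidental body mention, approximating "where the search words are best explained."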
Model-based video content representation
Lukas Diem, M. Zaharieva
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500254
Abstract: Recurring visual elements in videos commonly represent central content entities, such as main characters and dominant objects. The automated detection of such elements is crucial for various application fields, ranging from compact video content summarization to the retrieval of videos sharing common visual entities. Recent approaches for content-based video analysis commonly require prior knowledge about the appearance of potential objects of interest, or build upon a specific assumption, such as the presence of a particular camera view, object motion, or a reference set for estimating the appearance of an object. In this paper, we propose an unsupervised, model-based approach for the detection of recurring visual elements in a video sequence. Detected elements do not necessarily represent an object, yet they allow for visual and semantic interpretation. The experimental evaluation of detected models across different videos demonstrates the ability of the models to capture potentially high diversity in the visual appearance of the traced elements.
Citations: 0
Filterbank coefficients selection for segmentation in singer turns
Marwa Thlithi, J. Pinquier, Thomas Pellegrini, R. André-Obrecht
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500273
Abstract: Audio segmentation is often the first step of audio indexing systems; it provides segments that are assumed to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare several acoustic features for this task: filterbank coefficients (FBANK) and Mel-frequency cepstral coefficients (MFCC). FBANK features were shown to outperform MFCC on a "clean" singing corpus. We describe a coefficient selection method that allowed further improvement on this corpus: a 75.8% F-measure was obtained with FBANK features selected with this method, corresponding to a 30.6% absolute gain compared to MFCC. On another corpus, comprised of ethno-musicological recordings, both feature types showed a similar performance of about 60%. This corpus presents increased difficulty due to instruments overlapping with the singing and to lower recording quality.
Citations: 0
Large scale content-based video retrieval with LIvRE
Gabriel de Oliveira Barra, M. Lux, Xavier Giró-i-Nieto
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500266
Abstract: The fast growth of video data requires robust, efficient, and scalable systems for indexing and retrieval. These systems must be accessible from lightweight, portable, and usable interfaces that help users manage and search video content. This demo paper presents LIvRE, an extension of an existing open-source tool for image retrieval to support video indexing. LIvRE consists of three main system components (pre-processing, indexing, and retrieval), as well as a scalable and responsive HTML5 user interface accessible from a web browser. LIvRE supports image-based queries, which are efficiently matched with the extracted frames of the indexed videos.
Citations: 22
A hybrid graph-based and non-linear late fusion approach for multimedia retrieval
Ilias Gialampoukidis, A. Moumtzidou, Dimitris Liparas, S. Vrochidis, Y. Kompatsiaris
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500252
Abstract: Multimedia retrieval has become a task of high importance, due to the need for efficient and fast access to very large and heterogeneous multimedia collections. An interesting challenge within this task is the efficient combination of the different modalities of a multimedia object, and especially the fusion of textual and visual information. Unsupervised fusion of multiple modalities for retrieval has mostly been based on early, weighted-linear, graph-based, and diffusion-based techniques. In contrast, we present a strategy for fusing textual and visual modalities that combines a non-linear fusion model with a graph-based late fusion approach. The fusion strategy is based on the construction of a uniform multimodal contextual similarity matrix and the non-linear combination of relevance scores from query-based similarity vectors. The proposed late fusion approach is evaluated on the multimedia retrieval task on two multimedia collections, WIKI11 and IAPR-TC12. The experimental results indicate its superiority over the baseline method in terms of mean average precision on both datasets.
Citations: 14
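To make the late-fusion idea concrete: each modality produces a per-document relevance score for a query, and a non-linear combiner merges them into one ranking. The sketch below uses a weighted geometric mean, which rewards documents scoring well in both modalities; the paper's actual combination (contextual similarity matrix plus graph-based fusion) is more elaborate, so this only illustrates non-linear score fusion:

```python
def late_fuse(text_scores, visual_scores, w=0.5):
    """Non-linear late fusion of per-document relevance scores from two
    modalities via a weighted geometric mean (w = text weight).
    A document must score reasonably in BOTH modalities to rank high,
    unlike a weighted linear sum."""
    fused = {}
    for doc in text_scores.keys() & visual_scores.keys():
        t, v = text_scores[doc], visual_scores[doc]
        fused[doc] = (t ** w) * (v ** (1.0 - w))
    # Return documents sorted by fused relevance, best first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

With a linear sum, a document with scores (0.9, 0.1) ties one with (0.5, 0.5); the geometric mean ranks the balanced document higher, which is one motivation for non-linear combiners.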
Exploring an unsupervised, language independent, spoken document retrieval system
Alexandru Caranica, H. Cucu, Andi Buzo
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500262
Abstract: With the increasing availability of spoken documents in different languages, there is a need for systems that perform automatic, unsupervised search over audio streams containing speech, in a document retrieval scenario. We are interested in retrieving information from multilingual speech data, from spoken documents such as broadcast news, video archives, or even telephone conversations. The ultimate goal of a spoken document retrieval system is to enable vocabulary-independent search over large collections of speech content, to find written or spoken "queries" or reoccurring speech data. If the language is known, the task is relatively simple: one can use a large-vocabulary continuous speech recognition (LVCSR) tool to produce highly accurate word transcripts, which are then indexed, and query terms are retrieved from the index. However, if the language is unknown, queries are not part of the recognizer's vocabulary, so the relevant audio documents cannot be retrieved; search metrics suffer, and the documents retrieved are no longer relevant to the user. In this paper, we investigate whether input features derived from multi-language resources help unsupervised spoken term detection, independent of the language. Moreover, we explore multi-objective search, combining both language detection and LVCSR-based search with unsupervised spoken term detection (STD). To achieve this, we make use of multiple open-source tools and in-house acoustic and language models to propose a language-independent spoken document retrieval system.
Citations: 1
Indexing Ensembles of Exemplar-SVMs with rejecting taxonomies
Federico Becattini, Lorenzo Seidenari, A. Bimbo
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500241
Abstract: Ensembles of Exemplar-SVMs have been used for a wide variety of tasks, such as object detection, segmentation, label transfer, and mid-level feature learning. To make this technique effective, though, a large collection of classifiers is needed, which often makes the evaluation phase prohibitive. To overcome this issue, we exploit the joint distribution of exemplar classifier scores to build a taxonomy capable of indexing each Exemplar-SVM and enabling fast evaluation of the whole ensemble. We experiment with the Pascal 2007 benchmark on the task of object detection and on a simple segmentation task, in order to verify the robustness of our indexing data structure with respect to the standard ensemble. We also introduce a rejection strategy that discards non-relevant image patches for more efficient access to the data.
Citations: 1
Deep learning vs spectral clustering into an active clustering with pairwise constraints propagation
Nicolas Voiron, A. Benoît, P. Lambert, B. Ionescu
Pub Date: 2016-06-15 · DOI: 10.1109/CBMI.2016.7500237
Abstract: In our data-driven world, categorization is of major importance in helping end-users and decision makers understand information structures. Supervised learning techniques rely on annotated samples, which are often difficult to obtain, and training often overfits. Unsupervised clustering techniques, on the other hand, study the structure of the data without any training data. Given the difficulty of the task, supervised learning often outperforms unsupervised learning. A compromise is to use partial knowledge, selected in a smart way, to boost performance while minimizing learning costs; this is called semi-supervised learning. In this setting, spectral clustering has proved to be an efficient method. Deep learning has also outperformed several state-of-the-art classification approaches, and it is interesting to test it in our context. In this paper, we first introduce deep learning into an active semi-supervised clustering process and compare it with spectral clustering. Second, we introduce constraint propagation and demonstrate how it maximizes partitioning quality while reducing annotation costs. Experimental validation is conducted on two different real datasets, and the results show the potential of the clustering methods.
Citations: 3
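The annotation-cost argument above rests on the fact that pairwise constraints compose: if samples (a, b) and (b, c) are annotated as must-link, (a, c) is inferred for free. A minimal sketch of this transitive propagation using union-find (the paper's propagation is richer, also spreading cannot-link constraints through the similarity structure; this shows only the must-link case):

```python
def propagate_must_links(n, must_links):
    """Transitively propagate must-link constraints over n samples.
    Returns a root label per sample: two samples with the same root
    are (directly or by inference) must-linked."""
    parent = list(range(n))

    def find(x):
        # Find the root of x with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in must_links:
        parent[find(a)] = find(b)  # merge the two constraint groups
    return [find(i) for i in range(n)]
```

Each annotated pair can thus constrain many other pairs at once, which is how active selection of a few informative annotations improves partitioning quality cheaply.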
Comparing and combining unimodal methods for multimodal recognition
S. Ishikawa, Jorma T. Laaksonen
Pub Date: 2016-06-01 · DOI: 10.1109/CBMI.2016.7500253
Abstract: Multimodal recognition has recently become a more attractive and common approach in multimedia information retrieval, and in many cases it gives better recognition results than unimodal methods alone. Most current multimodal recognition methods still depend on unimodal recognition results, so choosing suitable features and classification models for each unimodal recognition task is important for good overall performance. In this paper, we study several unimodal recognition methods, the features they use, and techniques for combining them, in the setting of concept detection in image-text data. For image features, we use GoogLeNet deep convolutional neural network (DCNN) activation features and semantic concept vectors. For text features, we use simple binary vectors for tags and word2vec vectors. As the concept detection model, we apply the Multimodal Deep Boltzmann Machine (DBM) model and the Support Vector Machine (SVM) with a linear homogeneous kernel map and a non-linear radial basis function (RBF) kernel. Experimental results on the MIRFLICKR-1M dataset show that the Multimodal DBM and the non-linear SVM approaches produce equally good results, within the margins of statistical variation.
Citations: 1