Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing: Latest Publications

A system for assisted transcription and annotation of ancient documents
María José Castro Bleda, J. M. Vilar, D. Llorens, A. Marzal, F. Prat, Francisco Zamora-Martínez
DOI: 10.1145/3095713.3095752 (https://doi.org/10.1145/3095713.3095752)
Abstract: Computer-assisted transcription tools can speed up the process of reading and transcribing texts. At the same time, new annotation tools open new ways of accessing the text in its graphical form. STATE, an assisted transcription system for ancient documents, offers a multimodal interaction environment to assist humans in transcribing documents: the user can type, write on the screen, or utter a word. When one of these actions is used to correct an erroneous word, the system uses this new information to look for other mistakes. The system is modular: creation of projects from a set of document images, an automatic transcription system, and user interaction with the transcriptions to easily correct them as needed. This division of labor allows great flexibility in organizing the work of a team of transcribers. Our immediate goals are to improve the recognition system and to enrich the obtained transcriptions with scholarly descriptions.
Published: 2017-06-19. Citations: 0
Automatic Cartoon Colorization Based on Convolutional Neural Network
D. Varga, C. Szabó, T. Szirányi
DOI: 10.1145/3095713.3095742 (https://doi.org/10.1145/3095713.3095742)
Abstract: This paper deals with automatic cartoon colorization. This is a hard task, since it is an ill-posed problem that usually requires user intervention to achieve high quality. Motivated by recent successes in natural image colorization based on deep learning techniques, we investigate the colorization problem in the cartoon domain using a Convolutional Neural Network. To the best of our knowledge, no existing papers or research studies address this problem using deep learning techniques. Here we investigate a deep Convolutional Neural Network based automatic color-filling method for cartoons.
Published: 2017-06-19. Citations: 11
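The paper entry does not spell out the network here, so the sketch below is only a rough illustration of the idea: a small PyTorch encoder-decoder that maps a grayscale cartoon frame to its two chrominance channels. The layer sizes, the Lab-style colour split, and the toy training step are assumptions, not the authors' architecture.

```python
# Minimal, illustrative sketch only: a tiny encoder-decoder CNN that maps a
# grayscale (L-channel) cartoon frame to the two chrominance (ab) channels.
# Layer sizes and training details are assumptions, not the authors' network.
import torch
import torch.nn as nn

class ToyColorizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 2, 4, stride=2, padding=1), nn.Tanh(),  # ab in [-1, 1]
        )

    def forward(self, luminance):
        return self.decoder(self.encoder(luminance))

# One training step on a dummy batch of 256x256 frames.
model = ToyColorizer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
gray = torch.rand(4, 1, 256, 256)               # L channel
target_ab = torch.rand(4, 2, 256, 256) * 2 - 1  # ground-truth ab channels
loss = nn.functional.mse_loss(model(gray), target_ab)
loss.backward()
optimizer.step()
```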
Prediction of User Demographics from Music Listening Habits
Thomas Krismayer, M. Schedl, Peter Knees, Rick Rabiser
DOI: 10.1145/3095713.3095722 (https://doi.org/10.1145/3095713.3095722)
Abstract: Online activities such as social networking, shopping, and consuming multimedia create digital traces that are often used to improve user experience and increase revenue, e.g., through better-fitting recommendations and targeted marketing. We investigate to what extent the music listening habits of users of the social music platform Last.fm can be used to predict their age, gender, and nationality. We propose a TF-IDF-like feature modeling approach for artist listening information and artist tags, combined with additionally extracted features. We show that we can substantially outperform a baseline majority-voting approach and can compete with existing approaches. Further, regarding prediction accuracy vs. available listening data, we show that even a single listening event per user is enough to outperform the baseline in all prediction tasks. We conclude that personal information can be derived from music listening information, which can indeed help to better tailor recommendations.
Published: 2017-06-19. Citations: 9
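As a rough illustration of the TF-IDF-like weighting of listening data mentioned in the abstract, the sketch below builds such features from an invented user-by-artist play-count matrix and fits a simple classifier. The toy data, the scikit-learn components, and the target attribute are assumptions, not the authors' pipeline.

```python
# Illustrative sketch: TF-IDF-style weighting of per-user artist play counts,
# followed by a simple classifier for one demographic attribute.
# Data, features, and classifier choice are assumptions for illustration only.
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import LogisticRegression

# Rows = users, columns = artists, values = play counts (toy data).
play_counts = np.array([
    [120,   0,  3,  40],
    [  5,  80,  0,   2],
    [  0,  60, 90,   1],
    [200,   1,  2,  30],
])
labels = ["m", "f", "f", "m"]  # toy demographic labels

features = TfidfTransformer().fit_transform(play_counts)
clf = LogisticRegression().fit(features, labels)
print(clf.predict(features[:2]))
```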
Improving Hierarchical Image Classification with Merged CNN Architectures
Anuvabh Dutt, D. Pellerin, G. Quénot
DOI: 10.1145/3095713.3095745 (https://doi.org/10.1145/3095713.3095745)
Abstract: We consider the problem of image classification using deep convolutional networks, with respect to hierarchical relationships among classes. We investigate whether the semantic hierarchy is captured by CNN models. For this, we analyze the confidence of the model for a category and its sub-categories. Based on the results, we propose an algorithm for improving the model performance at test time by adapting the classifier to each test sample, without any re-training. Secondly, we propose a strategy for merging models to jointly learn two levels of the hierarchy. This reduces the total training time compared to training the models separately, and also gives improved classification performance.
Published: 2017-06-19. Citations: 3
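The abstract does not give the test-time adaptation rule, so the sketch below only illustrates one plausible way of exploiting the hierarchy: re-weighting fine-class scores by the confidence of their parent category. The class names and probabilities are invented; this is not the paper's algorithm.

```python
# Illustrative sketch only: combine coarse (parent) and fine (sub-class)
# softmax scores at test time by re-weighting each fine class with its
# parent's confidence, then re-normalizing. No re-training is involved.
parent_of = {"cat": "animal", "dog": "animal", "car": "vehicle", "bus": "vehicle"}
fine_probs = {"cat": 0.30, "dog": 0.25, "car": 0.35, "bus": 0.10}   # fine head
coarse_probs = {"animal": 0.8, "vehicle": 0.2}                      # coarse head

adapted = {c: p * coarse_probs[parent_of[c]] for c, p in fine_probs.items()}
total = sum(adapted.values())
adapted = {c: p / total for c, p in adapted.items()}
print(max(adapted, key=adapted.get), adapted)   # 'cat' now wins over 'car'
```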
CoMo: A Compact Composite Moment-Based Descriptor for Image Retrieval
S. A. Vassou, N. Anagnostopoulos, A. Amanatiadis, Klitos Christodoulou, S. Chatzichristofis
DOI: 10.1145/3095713.3095744 (https://doi.org/10.1145/3095713.3095744)
Abstract: Low-level features play a vital role in image retrieval. Image moments can effectively represent global information of image content while being invariant under translation, rotation, and scaling. This paper briefly presents a moment-based, composite, and compact low-level descriptor for image retrieval. In order to test the proposed feature, the authors employ the Bag-of-Visual-Words representation to perform experiments on two well-known benchmark image databases. The robust and highly competitive retrieval performance reported on all tested collections verifies the promising potential of the proposed descriptor.
Published: 2017-06-19. Citations: 15
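As an illustration of the kind of moment-based global feature the abstract refers to, the sketch below computes Hu moments with OpenCV. This is not the CoMo descriptor itself, and the image file name is hypothetical (a synthetic image is used if it is missing).

```python
# Illustrative sketch: Hu moments as an example of a compact, moment-based
# global image feature that is invariant to translation, rotation, and scale.
# This demonstrates image moments in general, NOT the CoMo descriptor.
import cv2
import numpy as np

image = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical file
if image is None:                                        # fall back to a synthetic shape
    image = np.zeros((128, 128), np.uint8)
    cv2.circle(image, (64, 64), 30, 255, -1)

moments = cv2.moments(image)             # raw spatial moments
hu = cv2.HuMoments(moments).flatten()    # 7 invariant values
descriptor = -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)  # common log-scaling
print(descriptor)
```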
NeuralStory: an Interactive Multimedia System for Video Indexing and Re-use
L. Baraldi, C. Grana, R. Cucchiara
DOI: 10.1145/3095713.3095735 (https://doi.org/10.1145/3095713.3095735)
Abstract: In recent years, video has been swamping the Internet: websites, social networks, and business multimedia systems are adopting video as the most important form of communication and information. Videos are normally accessed as a whole and are not indexed by their visual content. Thus, they are often uploaded as short, manually cut clips with user-provided annotations, keywords, and tags for retrieval. In this paper, we propose a prototype multimedia system which addresses these two limitations: it overcomes the need for human intervention in setting up the video, thanks to fully deep learning-based solutions, and decomposes the storytelling structure of the video into coherent parts. These parts can be shots, key-frames, scenes, and semantically related stories, and are exploited to provide an automatic annotation of the visual content, so that parts of a video can be easily retrieved. This also allows a principled re-use of the video itself: users of the platform can produce new storytelling by means of multi-modal presentations, add text and other media, and propose a different visual organization of the content. We present the overall solution, and some experiments on the re-use capability of our platform in edutainment, by conducting an extensive user evaluation.
Published: 2017-06-19. Citations: 7
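The system's decomposition is deep learning-based; as a much simpler stand-in for the idea of splitting a video into shots for indexing, the sketch below detects shot boundaries from colour-histogram differences between consecutive frames. The video file name and the threshold are assumptions, and this is not the authors' method.

```python
# Illustrative sketch: a naive shot-boundary detector based on colour-histogram
# distance between consecutive frames. Only the idea of decomposing a video
# into shots is shown here; the NeuralStory pipeline itself is deep learning-based.
import cv2

def detect_shot_boundaries(video_path, threshold=0.4):
    cap = cv2.VideoCapture(video_path)
    boundaries, prev_hist, index = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, None).flatten()
        if prev_hist is not None:
            # A large histogram distance between consecutive frames suggests a cut.
            if cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA) > threshold:
                boundaries.append(index)
        prev_hist, index = hist, index + 1
    cap.release()
    return boundaries

print(detect_shot_boundaries("lecture.mp4"))   # hypothetical input file
```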
Separating the Wheat from the Chaff: Events Detection in Twitter Data
Andrea Ferracani, Daniele Pezzatini, Lea Landucci, Giuseppe Becchi, A. Bimbo
DOI: 10.1145/3095713.3095728 (https://doi.org/10.1145/3095713.3095728)
Abstract: In this paper we present a system for the detection and validation of macro- and micro-events in cities (e.g. concerts, business meetings, car accidents) through the analysis of geolocalized messages from Twitter. A simple but effective method for unknown event detection is proposed, designed to alleviate the computational issues of traditional approaches. The method is exploited by a web interface that, in addition to visualizing the results of the automatic computation, exposes interactive tools to inspect and validate the data and to refine the processing pipeline. Researchers can use the web application to rapidly create macro- and micro-event datasets of geolocalized messages, which are currently unavailable and are needed to improve supervised and unsupervised event classification on Twitter. The system has been evaluated in terms of precision.
Published: 2017-06-19. Citations: 3
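The abstract does not describe the detection method, so the sketch below shows only a generic, assumed approach to the same task: spatio-temporal clustering of geolocalized tweets with DBSCAN. The tweet coordinates, the scaling, and the clustering parameters are invented and are not taken from the paper.

```python
# Illustrative sketch only: cluster geolocalized tweets in space and time with
# DBSCAN to surface candidate events. A generic, assumed approach, not the
# detection method of the paper; the data and parameters are invented.
import numpy as np
from sklearn.cluster import DBSCAN

# (latitude, longitude, hour-of-day) for a handful of toy tweets.
tweets = np.array([
    [43.771, 11.254,  9.0],
    [43.772, 11.255,  9.2],
    [43.770, 11.253,  9.1],   # dense group -> candidate event
    [43.900, 11.400, 15.0],   # isolated message -> noise
])
# Rescale so one "unit" is roughly comparable in each dimension.
scaled = tweets / np.array([0.005, 0.005, 1.0])
labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(scaled)
print(labels)   # e.g. [0 0 0 -1]: one event cluster, one noise point
```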
A free Web API for single and multi-document summarization
Massimo Mauro, Sergio Benini, N. Adami, A. Signoroni, R. Leonardi, Luca Canini
DOI: 10.1145/3095713.3095738 (https://doi.org/10.1145/3095713.3095738)
Abstract: In this work we present a free Web API for single- and multi-text summarization. The summarization algorithm follows an extractive approach, thus selecting the most relevant sentences from a single document or a document set. It integrates different text analysis techniques in a novel pipeline, ranging from keyword and entity extraction to topic modelling and sentence clustering, and gives results competitive with the state of the art. The application, written in Python, supports both plain texts and Web URLs as input. The API is publicly accessible for free using the specific conference token, as described on the reference page. The browser-based demo version, for summarization of single documents only, is publicly accessible at http://yonderlabs.com/demo.
Published: 2017-06-19. Citations: 0
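The API combines several techniques (keywords, entities, topics, sentence clustering); as a bare-bones illustration of the extractive principle alone, the sketch below scores sentences by their average TF-IDF weight and keeps the top-k. It is an assumed toy example, not the service's actual pipeline.

```python
# Illustrative sketch: a minimal extractive summarizer that ranks sentences by
# average TF-IDF weight and keeps the top-k in original order.
from sklearn.feature_extraction.text import TfidfVectorizer

def summarize(text, k=2):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tfidf = TfidfVectorizer().fit_transform(sentences)
    scores = tfidf.mean(axis=1).A1                     # average weight per sentence
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:k])
    return ". ".join(sentences[i] for i in top) + "."

doc = ("Content-based indexing enables retrieval of multimedia by its content. "
       "Summarization selects the most relevant sentences from a document. "
       "The weather was nice yesterday.")
print(summarize(doc))
```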
Bangladeshi Number Plate Detection: Cascade Learning vs. Deep Learning
M. Pias, Aunnoy K. Mutasim, M. Amin
DOI: 10.1145/3095713.3095727 (https://doi.org/10.1145/3095713.3095727)
Abstract: This work investigated two different machine learning techniques, Cascade Learning and Deep Learning, to find out which algorithm performs better at detecting the number plates of vehicles registered in Bangladesh. To do this, we created a dataset of about 1000 images collected from a security camera of Independent University, Bangladesh. Each image in the dataset was then labelled manually by selecting the Region of Interest (ROI). In the Cascade Learning approach, a sliding-window technique was used to detect objects; a cascade classifier was then employed to determine whether the window contained the object of interest or not. In the Deep Learning approach, the CIFAR-10 dataset was used to pre-train a 15-layer Convolutional Neural Network (CNN). Using this pre-trained CNN, a Regions with CNN (R-CNN) detector was then trained on our dataset. We found that the Deep Learning approach (maximum accuracy 99.60% using 566 training images) outperforms the detector constructed using Cascade classifiers (maximum accuracy 59.52% using 566 positive and 1022 negative training images) on 252 test images.
Published: 2017-06-19. Citations: 2
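On the Cascade Learning side, detection amounts to running a trained cascade classifier over image windows at multiple scales. The sketch below shows that step with OpenCV; the cascade file and test image names are hypothetical, and the authors' own trained models (both the cascade and the R-CNN) are not reproduced here.

```python
# Illustrative sketch of the cascade-learning side of the comparison: applying
# a trained cascade classifier with a multi-scale sliding window in OpenCV.
# File names are hypothetical placeholders, not the authors' artifacts.
import cv2

cascade = cv2.CascadeClassifier("plate_cascade.xml")   # hypothetical model file
frame = cv2.imread("gate_camera_frame.jpg")            # hypothetical test image
if frame is None or cascade.empty():
    raise SystemExit("supply a trained cascade and a test image to run this sketch")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
plates = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                  minSize=(60, 20))
for (x, y, w, h) in plates:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", frame)
```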
Visualizing weakly-Annotated Multi-label Mayan Inscriptions with Supervised t-SNE
E. Román-Rangel, S. Marchand-Maillet
DOI: 10.1145/3095713.3095720 (https://doi.org/10.1145/3095713.3095720)
Abstract: We present a supervised dimensionality reduction technique suitable for visualizing multi-label images in a 2-D space. This method extends the well-known t-distributed stochastic neighbor embedding (t-SNE) algorithm to the case of multi-label instances, where the concept of partial relevance plays an important role. Furthermore, it is directly applicable to weakly annotated data. We apply our approach to generate 2-D representations of Mayan glyph-blocks, which are groups of individual glyph-signs expressing full sentences. The resulting representations are used to place visual instances in a 2-D space with the purpose of providing a browsable catalog for further epigraphic studies, where nearby instances are similar in both semantic and visual terms. We evaluate the performance of our approach quantitatively by performing classification and retrieval experiments. Our results show that this approach obtains high performance in both of these tasks.
Published: 2017-06-19. Citations: 1
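The paper's contribution is a supervised, multi-label extension of t-SNE with partial relevance, which is not reproduced here; the sketch below only shows the standard unsupervised t-SNE embedding step it builds on, applied to invented descriptors with scikit-learn.

```python
# Illustrative sketch: plain (unsupervised) t-SNE on toy glyph-block descriptors.
# The supervised, multi-label variant proposed in the paper is NOT implemented
# here; this only demonstrates the baseline 2-D embedding step.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(50, 128))        # toy visual descriptors
embedding = TSNE(n_components=2, perplexity=10.0,
                 init="pca", random_state=0).fit_transform(descriptors)
print(embedding.shape)                          # (50, 2)
```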