M. Lahiri, Chayant Tantipathananandh, Rosemary Warungu, D. Rubenstein, T. Berger-Wolf
{"title":"Biometric animal databases from field photographs: identification of individual zebra in the wild","authors":"M. Lahiri, Chayant Tantipathananandh, Rosemary Warungu, D. Rubenstein, T. Berger-Wolf","doi":"10.1145/1991996.1992002","DOIUrl":"https://doi.org/10.1145/1991996.1992002","url":null,"abstract":"We describe an algorithmic and experimental approach to a fundamental problem in field ecology: computer-assisted individual animal identification. We use a database of noisy photographs taken in the wild to build a biometric database of individual animals differentiated by their coat markings. A new image of an unknown animal can then be queried by its coat markings against the database to determine if the animal has been observed and identified before. Our algorithm, called StripeCodes, efficiently extracts simple image features and uses a dynamic programming algorithm to compare images. We test its accuracy against two different classes of methods: Eigenface, which is based on algebraic techniques, and matching multi-scale histograms of differential image features, an approach from signal processing. StripeCodes performs better than all competing methods for our dataset, and scales well with database size.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130427386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large scale visual-based event matching","authors":"Mohamed Riadh Trad, A. Joly, N. Boujemaa","doi":"10.1145/1991996.1992049","DOIUrl":"https://doi.org/10.1145/1991996.1992049","url":null,"abstract":"Organizing media according to real-life events is attracting interest in the multimedia community. Event-centric indexing approaches are very promising for discovering more complex relationships between data. In this paper we introduce a new visual-based method for retrieving events in photo collections, typically in the context of User Generated Contents. Given a query event record, represented by a set of photos, our method aims to retrieve other records of the same event, typically generated by distinct users. Similarly to what is done in state-of-the-art object retrieval systems, we propose a two-stage strategy combining an efficient visual indexing model with a spatiotemporal verification re-ranking stage to improve query performance. For efficiency and scalability concerns, we implemented the proposed method according to the MapReduce programming model using Multi-Probe Locality Sensitive Hashing. Experiments were conducted on LastFM-Flickr dataset for distinct scenarios, including event retrieval, automatic annotation and tags suggestion. As one result, our method is able to suggest the correct event tag over 5 suggestions with a 72% success rate.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125935502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stefan Romberg, Lluis Garcia Pueyo, R. Lienhart, R. V. Zwol
{"title":"Scalable logo recognition in real-world images","authors":"Stefan Romberg, Lluis Garcia Pueyo, R. Lienhart, R. V. Zwol","doi":"10.1145/1991996.1992021","DOIUrl":"https://doi.org/10.1145/1991996.1992021","url":null,"abstract":"In this paper we propose a highly effective and scalable framework for recognizing logos in images. At the core of our approach lays a method for encoding and indexing the relative spatial layout of local features detected in the logo images. Based on the analysis of the local features and the composition of basic spatial structures, such as edges and triangles, we can derive a quantized representation of the regions in the logos and minimize the false positive detections. Furthermore, we propose a cascaded index for scalable multi-class recognition of logos. For the evaluation of our system, we have constructed and released a logo recognition benchmark which consists of manually labeled logo images, complemented with non-logo images, all posted on Flickr. The dataset consists of a training, validation, and test set with 32 logo-classes. We thoroughly evaluate our system with this benchmark and show that our approach effectively recognizes different logo classes with high precision.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129038267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comprehensive neural-based approach for text recognition in videos using natural language processing","authors":"Khaoula Elagouni, Christophe Garcia, P. Sébillot","doi":"10.1145/1991996.1992019","DOIUrl":"https://doi.org/10.1145/1991996.1992019","url":null,"abstract":"This work aims at helping multimedia content understanding by deriving benefit from textual clues embedded in digital videos. For this, we developed a complete video Optical Character Recognition system (OCR), specifically adapted to detect and recognize embedded texts in videos. Based on a neural approach, this new method outperforms related work, especially in terms of robustness to style and size variabilities, to background complexity and to low resolution of the image. A language model that drives several steps of the video OCR is also introduced in order to remove ambiguities due to a local letter by letter recognition and to reduce segmentation errors. This approach has been evaluated on a database of French TV news videos and achieves an outstanding character recognition rate of 95%, corresponding to 78% of words correctly recognized, which enables its incorporation into an automatic video indexing and retrieval system.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115219182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","authors":"","doi":"10.1145/1991996","DOIUrl":"https://doi.org/10.1145/1991996","url":null,"abstract":"","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121534011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}