{"title":"Quadruplet Networks for Sketch-Based Image Retrieval","authors":"Omar Seddati, S. Dupont, S. Mahmoudi","doi":"10.1145/3078971.3078985","DOIUrl":"https://doi.org/10.1145/3078971.3078985","url":null,"abstract":"Freehand sketches are a simple and powerful tool for communication. They are easily recognized across cultures and suitable for various applications. In this paper, we use deep convolutional neural networks (ConvNets) to address sketch-based image retrieval (SBIR). We first train our ConvNets on sketch and image object recognition in a large scale benchmark for SBIR (the sketchy database). We then conduct a comprehensive study of ConvNets features for SBIR, using a kNN similarity search paradigm in the ConvNet feature space. In contrast to recent SBIR works, we propose a new architecture the quadruplet networks which enhance ConvNet features for SBIR. This new architecture enables ConvNets to extract more robust global and local features. We evaluate our approach on three large scale datasets. Our quadruplet networks outperform previous state-of-the-art on two of them by a significant margin and gives competitive results on the third. Our system achieves a recall of 42.16% (at k=1) for the sketchy database (more than 5% improvement), a Kendal score of 43.28Τb on the TU-Berlin SBIR benchmark (close to 6Τb improvement) and a mean average precision (MAP) of 32.16% on Flickr15k (a category level SBIR benchmark).","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114382483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Semantic Facets for Adaptive Ranking of Social Comments","authors":"Elaheh Momeni, Reza Rawassizadeh, Eytan Adar","doi":"10.1145/3078971.3079004","DOIUrl":"https://doi.org/10.1145/3078971.3079004","url":null,"abstract":"An essential part of the social media ecosystem is user-generated comments. However, not all comments are useful to all people as both authors of comments and readers have different intentions and perspectives. Consequently, the development of automated approaches for the ranking of comments and the optimization of viewers' interaction experiences are becoming increasingly important. This work proposes an adaptive faceted ranking framework which enriches comments along multiple semantic facets (e.g., subjectivity, informativeness, and topics), thus enabling users to explore different facets and select combinations of facets in order to extract and rank comments that match their interests. A prototype implementation of the framework has been developed which allows us to evaluate different ranking strategies of the proposed framework. We find that adaptive faceted ranking shows significant improvements over prevalent ranking methods which are utilized by many platforms such as YouTube or The Economist. We observe substantial improvements in user experience when enriching each element of a comment along multiple explicit semantic facets rather than in a single topic or subjective facets.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134578702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Oral Session: Doctoral Symposium","authors":"T. Piatrik","doi":"10.1145/3254631","DOIUrl":"https://doi.org/10.1145/3254631","url":null,"abstract":"","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127426236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Effectiveness of Distance Measures for Similarity Search in Multi-Variate Sensory Data: Effectiveness of Distance Measures for Similarity Search","authors":"Yash Garg, S. Poccia","doi":"10.1145/3078971.3079009","DOIUrl":"https://doi.org/10.1145/3078971.3079009","url":null,"abstract":"Integration of rich sensor technologies with everyday applications, such as gesture recognition and health monitoring, has raised the importance of the ability to effectively search and analyze multi-variate time series data. Consequently, various time series distance measures (such as Euclidean distance, edit distance, and dynamic time warping) have been extended from uni-variate to multi-variate time series. In this paper, we note that the naive extensions of these measures may not necessarily be effective when analyzing multi-variate time series data. We present several algorithms, some of which leverage external metadata describing the potential relationships, either learned from the data or captured from the metadata, among the variates. We then experimentally study the effectiveness of multi-variate time series distance measures against human motion data sets.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128207241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of sMRI for Alzheimer's disease Diagnosis with CNN: Single Siamese Networks with 2D+? Approach and Fusion on ADNI","authors":"Karim Aderghal, J. Benois-Pineau, K. Afdel","doi":"10.1145/3078971.3079010","DOIUrl":"https://doi.org/10.1145/3078971.3079010","url":null,"abstract":"The methods of Content-Based visual information indexing and retrieval penetrate into Healthcare and become popular in Computer-Aided Diagnostics. The PhD research we have started 13 months ago is devoted to the multimodal classification of MRI brain scans for Alzheimer Disease diagnostics. We use the winner classifier, such as CNN. We first proposed an original 2D+ approach. It avoids heavy volumetric computations and uses domain knowledge on Alzheimer biomarkers. We study discriminative power of different brain projections. Three binary classification tasks are considered separating Alzheimer Disease (AD) patients from Mild Cognitive Impairment (MCI) and Normal Control subject (NC). Two fusion methods on FC layer and on the single-projection CNN output show better performances, up to 91% of accuracy is achieved. The results are competitive with the SOA which uses heavier algorithmic chain.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125564374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expo: An Expectation-oriented System for Selecting Important Photos from Personal Collections","authors":"Andrea Ceroni, V. Solachidis, C. Niederée, Olga Papadopoulou, V. Mezaris","doi":"10.1145/3078971.3079011","DOIUrl":"https://doi.org/10.1145/3078971.3079011","url":null,"abstract":"The diffusion of digital photography lets people take hundreds of photos during personal events, such as trips and ceremonies. Many methods have been developed for summarizing such large personal photo collections. However, they usually emphasize the coverage of the original collection, without considering which photos users would select, i.e. their expectations. In this paper we present Expo, a system that aims at selecting which photos users perceive as most important and would have selected, thus meeting their expectations. It does not rely on any manually provided annotation, thus keeping the effort of users low. Photos are processed by applying a wide set of image processing techniques and a subset of a required size is selected. Users can review and modify the automatic selection. The system can also be used to gather training data by letting users select their preferred photos from the imported collections.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123963914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web","authors":"C. Mattmann, M. Sharan","doi":"10.1145/3078971.3079019","DOIUrl":"https://doi.org/10.1145/3078971.3079019","url":null,"abstract":"We contribute a scalable, open source implementation of the Pooled Time Series (PoT) algorithm from CVPR 2015. The algorithm is evaluated on approximately 6800 human trafficking (HT) videos collected from the deep and dark web, and on an open dataset: the Human Motion Database (HMDB). We describe PoT and our motivation for using it on larger data and the issues we encountered. Our new solution reimagines PoT as an Apache Hadoop-based algorithm. We demonstrate that our new Hadoop-based algorithm successfully identifies similar videos in the HT and HMDB datasets and we evaluate the algorithm qualitatively and quantitatively.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131181514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Oral Session 1: Vision and Language (Spotlight Presentations)","authors":"H. Cucu","doi":"10.1145/3254616","DOIUrl":"https://doi.org/10.1145/3254616","url":null,"abstract":"","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131489158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Oral Session 4: Cross-media Retrieval (Oral presentations)","authors":"Giorgos Tolias","doi":"10.1145/3254626","DOIUrl":"https://doi.org/10.1145/3254626","url":null,"abstract":"","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123000380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Supervised Hashing for Multi-Label and Large-Scale Image Retrieval","authors":"Dayan Wu, Zheng Lin, Bo Li, Mingzhen Ye, Weiping Wang","doi":"10.1145/3078971.3078989","DOIUrl":"https://doi.org/10.1145/3078971.3078989","url":null,"abstract":"One of the most challenging tasks in large-scale multi-label image retrieval is to map images into binary codes while preserving multilevel semantic similarity. Recently, several deep supervised hashing methods have been proposed to learn hash functions that preserve multilevel semantic similarity with deep convolutional neural networks. However, these triplet label based methods try to preserve the ranking order of images according to their similarity degrees to the queries while not putting direct constraints on the distance between the codes of very similar images. Besides, the current evaluation criteria are not able to measure the performance of existing hashing methods on preserving fine-grained multilevel semantic similarity. To tackle these issues, we propose a novel Deep Multilevel Semantic Similarity Preserving Hashing (DMSSPH) method to learn compact similarity-preserving binary codes for the huge body of multi-label image data with deep convolutional neural networks. In our approach, we make the best of the supervised information in the form of pairwise labels to maximize the discriminability of output binary codes. Extensive evaluations conducted on several benchmark datasets demonstrate that the proposed method significantly outperforms the state-of-the-art supervised and unsupervised hashing methods at the accuracies of top returned images, especially for shorter binary codes. Meanwhile, the proposed method shows better performance on preserving fine-grained multilevel semantic similarity according to the results under the Jaccard coefficient based evaluation criteria we propose.","PeriodicalId":403556,"journal":{"name":"Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116265271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}