{"title":"Querying for video events by semantic signatures from few examples","authors":"M. Mazloom, A. Habibian, Cees G. M. Snoek","doi":"10.1145/2502081.2502160","DOIUrl":"https://doi.org/10.1145/2502081.2502160","url":null,"abstract":"We aim to query web video for complex events using only a handful of video query examples, where the standard approach learns a ranker from hundreds of examples. We consider a semantic signature representation, consisting of off-the-shelf concept detectors, to capture the variance in semantic appearance of events. Since it is unknown what similarity metric and query fusion to use in such an event retrieval setting, we perform three experiments on unconstrained web videos from the TRECVID event detection task. It reveals that: retrieval with semantic signatures using normalized correlation as similarity metric outperforms a low-level bag-of-words alternative, multiple queries are best combined using late fusion with an average operator, and event retrieval is preferred over event classification when less than eight positive video examples are available.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84589933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time salient object detection","authors":"Chia-Ju Lu, Chih-Fan Hsu, Mei-Chen Yeh","doi":"10.1145/2502081.2502240","DOIUrl":"https://doi.org/10.1145/2502081.2502240","url":null,"abstract":"Salient object detection techniques have a variety of multimedia applications of broad interest. However, the detection must be fast to truly aid in these processes. There exist many robust algorithms tackling the salient object detection problem but most of them are computationally demanding. In this demonstration we show a fast salient object detection system implemented in a conventional PC environment. We examine the challenges faced in the design and development of a practical system that can achieve accurate detection in real-time.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80694623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object co-segmentation via discriminative low rank matrix recovery","authors":"Yong Li, J. Liu, Zechao Li, Yang Liu, Hanqing Lu","doi":"10.1145/2502081.2502195","DOIUrl":"https://doi.org/10.1145/2502081.2502195","url":null,"abstract":"The goal of this paper is to simultaneously segment the object regions appearing in a set of images of the same object class, known as object co-segmentation. Unlike typical methods, which simply assume that the regions common among images are the object regions, we additionally consider the disturbance from consistent backgrounds, and identify regions that are not only common but also salient among images as the object regions. To this end, we propose a Discriminative Low Rank matrix Recovery (DLRR) algorithm to divide the over-segmented regions (i.e., superpixels) of a given image set into object and non-object ones. In DLRR, a low-rank matrix recovery term is adopted to detect salient regions in an image, while a discriminative learning term is used to distinguish the object regions from all the superpixels. An additional regularization term is introduced to jointly measure the disagreement between the predicted saliency and the objectness probability corresponding to each superpixel of the image set. For the unified learning problem formed by connecting the above three terms, we design an efficient optimization procedure based on block-coordinate descent. Extensive experiments are conducted on two public datasets, i.e., MSRC and iCoseg, and comparisons with several state-of-the-art methods demonstrate the effectiveness of our work.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"134 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78654334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facilitating fashion camouflage art","authors":"Ranran Feng, B. Prabhakaran","doi":"10.1145/2502081.2502121","DOIUrl":"https://doi.org/10.1145/2502081.2502121","url":null,"abstract":"Artists and fashion designers have recently been creating a new form of art -- Camouflage Art -- which can be used to prevent computer vision algorithms from detecting faces. This digital art technique combines makeup and hair styling, or other modifications such as facial painting, to help avoid automatic face detection. In this paper, we first study the camouflage interference and its effectiveness against several current state-of-the-art techniques in face detection/recognition, and then present a tool that can facilitate digital art design for camouflage that fools these computer vision algorithms. The tool finds the prominent or decisive features in facial images that enable the face to be recognized, and suggests camouflage options (makeup, styling, paints) for particular facial features or facial parts. Testing of this tool shows that it can effectively aid artists or designers in creating camouflage designs that thwart recognition. The evaluation of suggested camouflages applied to 40 celebrities across eight different face recognition systems (both non-commercial and commercial) shows that the subject is unrecognizable 82.5%-100% of the time when using the suggested camouflage.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78169275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring discriminative pose sub-patterns for effective action classification","authors":"Xu Zhao, Yuncai Liu, Yun Fu","doi":"10.1145/2502081.2502094","DOIUrl":"https://doi.org/10.1145/2502081.2502094","url":null,"abstract":"The articulated configuration of human body parts is an essential representation of human motion, and is therefore well suited for classifying human actions. In this work, we propose a novel approach to exploring discriminative pose sub-patterns for effective action classification. These pose sub-patterns are extracted from a predefined set of 3D poses represented by hierarchical motion angles. The basic idea is motivated by two observations: (1) there exist representative sub-patterns in each action class from which the action class can be easily differentiated, and (2) these sub-patterns appear frequently within the action class. By connecting frequent sub-patterns to a discriminative measure, we develop SSPI, the Support Sub-Pattern Induced learning algorithm, for simultaneous feature selection and feature learning. Based on this algorithm, discriminative pose sub-patterns can be identified and used as a series of \"magnetic centers\" on the surface of a normalized super-sphere for feature transformation. The \"attractive forces\" from the sub-patterns determine the direction and step length of the transformation, which makes a feature more discriminative while maintaining dimensionality invariance. Comprehensive experimental studies conducted on a large-scale motion capture dataset demonstrate the effectiveness of the proposed approach for action classification and its superior performance over state-of-the-art techniques.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84524232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Golden retriever: a Java based open source image retrieval engine","authors":"Lazaros Tsochatzidis, C. Iakovidou, S. Chatzichristofis, Y. Boutalis","doi":"10.1145/2502081.2502227","DOIUrl":"https://doi.org/10.1145/2502081.2502227","url":null,"abstract":"Golden Retriever Image Retrieval Engine (GRire) is an open source, lightweight Java library developed for Content Based Image Retrieval (CBIR) tasks, employing the Bag of Visual Words (BOVW) model. It provides a complete framework for creating CBIR systems, including image analysis tools, classifiers, weighting schemes, etc., for efficient indexing and retrieval procedures. Its eminent feature is its extensibility, achieved through the open source nature of the library as well as a user-friendly embedded plug-in system. GRire is available online along with installation and development documentation at http://www.grire.net and on its Google Code page http://code.google.com/p/grire. It is distributed either as a Java library or as a standalone Java application, both GPL licensed.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86591002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Moment feature based forensic detection of resampled digital images","authors":"Lu Li, Jianru Xue, Zhiqiang Tian, Nanning Zheng","doi":"10.1145/2502081.2502150","DOIUrl":"https://doi.org/10.1145/2502081.2502150","url":null,"abstract":"Forensic detection of resampled digital images has become an important technology among many others to establish the integrity of digital visual content. This paper proposes a moment feature based method to detect resampled digital images. Rather than concentrating on the positions of characteristic resampling peaks, we utilize a moment feature to exploit the periodic interpolation characteristics in the frequency domain. Not only the positions of resampling peaks but also the amplitude distribution is taken into consideration. With the extracted moment feature, a trained SVM classifier is used to detect resampled digital images. Extensive experimental results show the validity and efficiency of the proposed method.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"26 4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90867690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient image homomorphic encryption scheme with small ciphertext expansion","authors":"Peijia Zheng, Jiwu Huang","doi":"10.1145/2502081.2502105","DOIUrl":"https://doi.org/10.1145/2502081.2502105","url":null,"abstract":"The field of image processing in the encrypted domain has received increasing attention owing to its extensive potential applications, for example, providing efficient and secure solutions for privacy-preserving applications in untrusted environments. One obstacle to the widespread use of these techniques is the ciphertext expansion of several orders of magnitude caused by existing homomorphic encryptions. In this paper, we provide a way to tackle this issue for image processing in the encrypted domain. By exploiting characteristics of the image format, we develop an image encryption scheme that limits ciphertext expansion while preserving the homomorphic property. The proposed scheme first encrypts image pixels with an existing probabilistic homomorphic cryptosystem, and then compresses the whole encrypted image in order to save storage space. Our scheme has a much smaller ciphertext expansion factor than the element-wise encryption scheme, while preserving the homomorphic property, and no additional interactive protocols are required when applying secure signal processing tools to the compressed encrypted image. We present a fast algorithm for the encryption and compression steps of the proposed scheme, which speeds up the computation and makes our scheme much more efficient. Analyses of the security, ciphertext expansion ratio, and computational complexity are also conducted. Our experiments demonstrate the validity of the proposed algorithms. The proposed scheme is suitable as an image encryption method for applications in secure image processing.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"309 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91457979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Competitive affective gaming: winning with a smile","authors":"André Mourão, João Magalhães","doi":"10.1145/2502081.2502115","DOIUrl":"https://doi.org/10.1145/2502081.2502115","url":null,"abstract":"Human-computer interaction (HCI) is expanding towards natural modalities of human expression. Gestures, body movements and other affective interaction techniques can change the way computers interact with humans. In this paper, we propose to extend existing interaction paradigms by including facial expression as a controller in videogames. NovaEmötions is a multiplayer game where players score by acting out an emotion through a facial expression. We designed an algorithm to offer an engaging interaction experience using facial expression. Despite the novelty of the interaction method, our game scoring algorithm kept players engaged and competitive. A user study with 46 users demonstrated the success and potential of affect-based interaction in videogames, i.e., facial expression as the sole game controller. Moreover, we released a novel facial expression dataset with over 41,000 images. These face images were captured in a novel and realistic setting: users playing games where a player's facial expression has an impact on the game score.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72964125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Background subtraction via coherent trajectory decomposition","authors":"Zhixiang Ren, L. Chia, D. Rajan, Shenghua Gao","doi":"10.1145/2502081.2502144","DOIUrl":"https://doi.org/10.1145/2502081.2502144","url":null,"abstract":"Background subtraction, the task to detect moving objects in a scene, is an important step in video analysis. In this paper, we propose an efficient background subtraction method based on coherent trajectory decomposition. We assume that the trajectories from background lie in a low-rank subspace, and foreground trajectories are sparse outliers in this background subspace. Meanwhile, the Markov Random Field (MRF) is used to encode the spatial coherency and trajectory consistency. With the low-rank decomposition and the MRF, our method can better handle videos with moving camera and obtain coherent foreground. Experimental results on a video dataset show our method achieves very competitive performance.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85515356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}