{"title":"A Smart Kitchen Infrastructure","authors":"Marcus Ständer, Aristotelis Hadjakos, Niklas Lochschmidt, Christian Klos, B. Renner, M. Mühlhäuser","doi":"10.1109/ISM.2012.27","DOIUrl":"https://doi.org/10.1109/ISM.2012.27","url":null,"abstract":"In the future, our homes will increasingly be equipped with sensing and interaction devices that make new multimedia experiences possible. These experiences will not necessarily be bound to the TV, tabletop, smartphone, tablet or desktop computer but will be embedded in our everyday surroundings. In order to enable new forms of interaction, we equipped an ordinary kitchen with a large variety of sensors according to best practices. An innovation in comparison to related work is our Information Acquisition System, which allows monitoring and controlling kitchen appliances remotely. This paper presents our sensing infrastructure and novel interactions in the kitchen that are enabled by the Information Acquisition System.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129328519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AudioAlign - Synchronization of A/V-Streams Based on Audio Data","authors":"Mario Guggenberger, M. Lux, L. Böszörményi","doi":"10.1109/ISM.2012.79","DOIUrl":"https://doi.org/10.1109/ISM.2012.79","url":null,"abstract":"Manual synchronization of audio and video recordings is a tedious and time-consuming task, especially if the tracks are very long and/or of large quantity. If the tracks aren't just short clips (of a few seconds or minutes) and are recorded from heterogeneous sources, an additional problem comes into play - time drift - which arises if different recording devices aren't synchronized. This demo paper presents the experimental software AudioAlign, which aims to simplify the manual synchronization process with the ultimate goal of automating it altogether. It gives a short introduction to the topic, discusses the approach, method, implementation and preliminary results, and gives an outlook on possible improvements.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116056541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JIRL - A C++ Library for JPEG Compressed Domain Image Retrieval","authors":"David Edmundson, G. Schaefer","doi":"10.1109/ISM.2012.48","DOIUrl":"https://doi.org/10.1109/ISM.2012.48","url":null,"abstract":"In this paper we present JIRL, an open-source C++ software suite for performing content-based image retrieval in the JPEG compressed domain. We provide implementations of nine retrieval algorithms representing the current state of the art. For each algorithm, methods for compressed domain feature extraction as well as feature comparison are provided in an object-oriented framework. In addition, our software suite includes functionality for benchmarking retrieval algorithms in terms of retrieval performance and retrieval time. An example full image retrieval application is also provided to demonstrate how the library can be used. JIRL is made available to fellow researchers under the LGPL.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124981340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU Hierarchical Quilted Self Organizing Maps for Multimedia Understanding","authors":"Y. Nashed","doi":"10.1109/ISM.2012.102","DOIUrl":"https://doi.org/10.1109/ISM.2012.102","url":null,"abstract":"It is well established that the human brain outperforms current computers on pattern recognition tasks through the collaborative processing of simple building units (neurons). In this work we extend an abstracted model of the neocortex called the Hierarchical Quilted Self Organizing Map, benefiting from the parallel power of current Graphics Processing Units, to achieve real-time understanding and classification of spatio-temporal sensory information. We also propose an improvement on the original model that allows the learning rate to be automatically adapted according to the available input training data. The overall system is tested on the task of gesture recognition using a publicly available Microsoft Kinect dataset.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126042133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ARtifact: Tablet-Based Augmented Reality for Interactive Analysis of Cultural Artifacts","authors":"D. Vanoni, M. Seracini, F. Kuester","doi":"10.1109/ISM.2012.17","DOIUrl":"https://doi.org/10.1109/ISM.2012.17","url":null,"abstract":"To ensure the preservation of cultural heritage, artifacts such as paintings must be analyzed to diagnose physical frailties that could result in permanent damage. Advancements in digital imaging techniques and computer-aided analysis have greatly aided in such diagnoses but can limit the ability to work directly with the artifact in the field. This paper presents the implementation and application of ARtifact, a tablet-based augmented reality system that enables on-site visual analysis of the artifact in question. Utilizing real-time tracking of the artifact under observation, a user interacting with the tablet can study various layers of data registered with the physical object in situ. These layers, representing data acquired through various imaging modalities such as infrared thermography and ultraviolet fluorescence, provide the user with an augmented view of the artifact to aid in on-site diagnosis and restoration. Intuitive interaction techniques further enable targeted analysis of artifact-related data. We present a case study utilizing our tablet system to analyze a 16th century Italian hall and highlight the benefits of our approach.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"766 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132969869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Recognition Using Discrete Tchebichef-Krawtchouk Transform","authors":"Wissam A. Jassim, Paramesran Raveendran","doi":"10.1109/ISM.2012.31","DOIUrl":"https://doi.org/10.1109/ISM.2012.31","url":null,"abstract":"In this paper, a face recognition system based on the Discrete Tchebichef-Krawtchouk Transform (DTKT) and Support Vector Machines (SVMs) is proposed. The objective of this paper is to present the following: (1) the mathematical and theoretical framework defining the DTKT, including the transform equations that need to be addressed; (2) the DTKT features used in the classification of faces; (3) results of empirical tests that compare the representational capabilities of this transform with other types of discrete transforms such as the Discrete Tchebichef Transform (DTT), the Discrete Krawtchouk Transform (DKT), and the Discrete Cosine Transform (DCT). The system is tested on a large number of faces collected from the ORL and Yale face databases. Empirical results show that the proposed transform gives very good overall accuracy under clean and noisy conditions.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130483070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D Scene Generation by Learning from Examples","authors":"Mesfin Dema, H. Sari-Sarraf","doi":"10.1109/ISM.2012.19","DOIUrl":"https://doi.org/10.1109/ISM.2012.19","url":null,"abstract":"Due to the overwhelming use of 3D models in video games and virtual environments, there is a growing interest in 3D scene generation, scene understanding and 3D model retrieval. In this paper, we introduce a data-driven 3D scene generation approach from a Maximum Entropy (MaxEnt) model selection perspective. Using this model selection criterion, new scenes can be sampled by matching a set of contextual constraints that are extracted from training and synthesized scenes. Starting from a set of random synthesized configurations of objects in 3D, the MaxEnt distribution is iteratively sampled (using Metropolis sampling) and updated until the constraints between training and synthesized scenes match, indicating the generation of plausible synthesized 3D scenes. To illustrate the proposed methodology, we use 3D training desk scenes that are all composed of seven predefined objects with different position, scale and orientation arrangements. After applying the MaxEnt framework, the synthesized scenes show that the proposed strategy can generate scenes reasonably similar to the training examples without any human supervision during sampling. We would like to mention, however, that such an approach is not limited to desk scene generation as described here and can be extended to any 3D scene generation problem.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122071912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting JPEG Compression for Image Retrieval","authors":"David Edmundson, G. Schaefer","doi":"10.1109/ISM.2012.99","DOIUrl":"https://doi.org/10.1109/ISM.2012.99","url":null,"abstract":"Content-based image retrieval (CBIR) has been an active research area for many years, yet much of the research ignores the fact that most images are stored in compressed form, which affects retrieval both in terms of processing speed and retrieval accuracy. In this paper, we address various aspects of JPEG compressed images in the context of image retrieval. We first analyse the effect of JPEG quantisation on image retrieval and present a robust method to address the resulting performance drop. We then compare various retrieval methods that work in the JPEG compressed domain and finally propose two new methods that are based solely on information available in the JPEG header. One uses optimised Huffman tables for retrieval, while the other is based on tuned quantisation tables. Both techniques are shown to give retrieval performance comparable to existing methods while being orders of magnitude faster.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126194474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mutual Information Based Stereo Correspondence in Extreme Cases","authors":"Qing Tian, GuangJun Tian","doi":"10.1109/ISM.2012.46","DOIUrl":"https://doi.org/10.1109/ISM.2012.46","url":null,"abstract":"Stereo correspondence is an ill-posed problem, mainly due to matching ambiguity, which is especially serious in extreme cases where the corresponding relationship is unknown and can be very complicated. Mutual information (MI), which assumes no prior relationship between the matching pair, is a good solution to this problem. This paper proposes a context-aware mutual information and Markov Random Field (MRF) based approach, with gradient information introduced into both the data term and the smoothness term of the MAP-MRF framework, where techniques such as graph cuts can be used to find an accurate disparity map. The results show that the proposed context-aware method outperforms non-MI and traditional MI-based methods both quantitatively and qualitatively in some extreme cases.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126438175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Motion-Sketch Based Video Retrieval Using MST-CSS Representation","authors":"C. Chattopadhyay, Sukhendu Das","doi":"10.1109/ISM.2012.76","DOIUrl":"https://doi.org/10.1109/ISM.2012.76","url":null,"abstract":"In this work, we propose a framework for a robust Content Based Video Retrieval (CBVR) system with free-hand query sketches, using the Multi-Spectro Temporal-Curvature Scale Space (MST-CSS) representation. Our designed interface allows sketches to be drawn to depict the shape of the object in motion and its trajectory. We obtain the MST-CSS feature representation using these cues and match it against a set of MST-CSS features generated offline from the video clips in the database (gallery). Results are displayed in rank order of similarity. Experimentation with benchmark datasets shows promising results.","PeriodicalId":282528,"journal":{"name":"2012 IEEE International Symposium on Multimedia","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126537211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}