{"title":"Endoscopic Video Retrieval: A Signature-Based Approach for Linking Endoscopic Images with Video Segments","authors":"C. Beecks, Klaus Schöffmann, M. Lux, M. S. Uysal, T. Seidl","doi":"10.1109/ISM.2015.21","DOIUrl":"https://doi.org/10.1109/ISM.2015.21","url":null,"abstract":"In the field of medical endoscopy, more and more surgeons are beginning to record and store videos of their endoscopic procedures, such as surgeries and examinations, in long-term video archives. In order to support surgeons in accessing these endoscopic video archives in a content-based way, we propose a simple yet effective signature-based approach: the Signature Matching Distance based on adaptive-binning feature signatures. The proposed distance-based similarity model facilitates an adaptive representation of the visual properties of endoscopic images and allows for matching these properties efficiently. We conduct an extensive performance analysis with respect to the task of linking specific endoscopic images with video segments and show the high efficacy of our approach. We are able to link more than 88% of the endoscopic images to their corresponding correct video segments, which improves the current state of the art by one order of magnitude.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125094710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Unified Image Tagging System Driven by Image-Click-Ads Framework","authors":"Qiong Wu, P. Boulanger","doi":"10.1109/ISM.2015.12","DOIUrl":"https://doi.org/10.1109/ISM.2015.12","url":null,"abstract":"With the exponential growth of web image data, image tagging is becoming crucial in many image-based applications such as object recognition and content-based image retrieval. Despite the great progress achieved in automatic recognition technologies, none yet provides a satisfactory, widely useful solution to generic image recognition problems. So far, only manual tagging can provide reliable tagging results. However, such work is tedious and costly, and workers have little motivation. In this paper, we propose an online image tagging system, EyeDentifyIt, driven by an image-click-ads framework, which motivates crowdsourcing workers as well as general web users to tag images at high quality for low cost with low workload. A series of usability studies is presented to demonstrate how EyeDentifyIt provides improved user motivation and requires less workload, compared to state-of-the-art approaches.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133660355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Location Specification and Representation in Multimedia Databases","authors":"H. Samet","doi":"10.1109/ISM.2015.128","DOIUrl":"https://doi.org/10.1109/ISM.2015.128","url":null,"abstract":"Techniques for the specification and representation of the locational component of multimedia data are reviewed. The focus is on how the locational component is specified and also on how it is represented. For the specification component we also discuss textual specifications. For the representation component, the emphasis is on a sorting approach which yields an index to the locational component where the data includes both points as well as objects with a spatial extent.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131232263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A User-Based Framework for Group Re-Identification in Still Images","authors":"Nestor Z. Salamon, Julio C. S. Jacques Junior, S. Musse","doi":"10.1109/ISM.2015.41","DOIUrl":"https://doi.org/10.1109/ISM.2015.41","url":null,"abstract":"In this work we propose a framework for group re-identification based on manually defined soft-biometric characteristics. Users are able to choose colors that describe the soft-biometric attributes of each person belonging to the searched group. Our technique matches these structured attributes against image databases using color distance metrics, a novel adaptive threshold selection, and a high-level feature based on people's proximity. Experimental results show that the proposed approach is able to assist the re-identification procedure by ranking the most likely results without training data, and is also extensible to work without previous images.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131651283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Development of a Cloud Based Cyber-Physical Architecture for the Internet-of-Things","authors":"K. M. Alam, Alex Sopena, Abdulmotaleb El Saddik","doi":"10.1109/ISM.2015.96","DOIUrl":"https://doi.org/10.1109/ISM.2015.96","url":null,"abstract":"Internet-of-Things (IoT) is considered the next big disruptive technology field, whose main goal is to achieve social good by enabling collaboration among physical things or sensors. We present a cloud-based cyber-physical architecture to leverage the Sensing-as-a-Service (SenAS) model, where every physical thing is complemented by a cloud-based twin cyber process. In this model, things can communicate using direct physical connections or through the cyber layer using peer-to-peer inter-process communications. The proposed model offers simultaneous communication channels among groups of things by uniquely tagging each group with a relationship ID. An intelligent service layer ensures custom privacy and access rights management for the sensor owners. We also present the implementation details of an IoT platform and demonstrate its practicality by developing case study applications for the Internet-of-Vehicles (IoV) and the connected smart home.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131654296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstructing Missing Areas in Facial Images","authors":"Christoph Jansen, Radek Mackowiak, N. Hezel, Moritz Ufer, Gregor Altstadt, K. U. Barthel","doi":"10.1109/ISM.2015.68","DOIUrl":"https://doi.org/10.1109/ISM.2015.68","url":null,"abstract":"In this paper, we present a novel approach to reconstruct missing areas in facial images by using a series of Restricted Boltzmann Machines (RBMs). RBMs created with a low number of hidden neurons generalize well and are able to reconstruct basic structures in the missing areas. Networks with many hidden neurons, on the other hand, tend to emphasize details when using the reconstruction of the previous, more generalized RBMs as their input. Since trained RBMs are fast in encoding and decoding data by design, our method is also suitable for processing video streams.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131684597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-Based Video Browsing - Investigating Interface Design for Fast Video Browsing","authors":"Wolfgang Hürst, R. V. D. Werken","doi":"10.1109/ISM.2015.104","DOIUrl":"https://doi.org/10.1109/ISM.2015.104","url":null,"abstract":"The Video Browser Showdown (VBS) is an annual event where researchers evaluate their video search systems in a competitive setting. Searching in videos is often a two-step process: first some sort of pre-filtering is done, where, for example, users query an indexed archive of files, followed by human-based browsing, where users skim the returned result set in search of the relevant file or portion of it. The VBS aims at this whole search process, focusing in particular on its interactive aspects. Encouraged by previous years' results, we created a system that purely addresses the latter issue, i.e., interface and interaction design. By eliminating all kinds of video indexing and query processing, we aimed to demonstrate the importance of good interface design for video search, and that its relevance is often underestimated by today's systems. This claim is clearly proven by the results our system achieved in the VBS 2015 competition, where our approach was on a par with the top-performing ones. In this paper, we describe our system along with related design decisions, present our results from the VBS event, and discuss them in further detail.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131982040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Go Green with EnVI: the Energy-Video Index","authors":"Oche Ejembi, S. Bhatti","doi":"10.1109/ISM.2015.50","DOIUrl":"https://doi.org/10.1109/ISM.2015.50","url":null,"abstract":"Video is the most prevalent traffic type on the Internet today. Significant research has been done on measuring users' Quality of Experience (QoE) through different metrics. We take the position that energy use must be incorporated into quality metrics for digital video. We present our novel, energy-aware QoE metric for video, the Energy-Video Index (EnVI). We present our EnVI measurements from the playback of a diverse set of online videos. We observe that 4K-UHD (2160p) video can use ~30% more energy on a client device compared to HD (720p), and up to ~600% more network bandwidth than FHD (1080p), without significant improvement in objective QoE measurements.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116758338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Portable Lecture Capture that Captures the Complete Lecture","authors":"P. Dickson, Chris Kondrat, Ryan B. Szeto, W. R. Adrion, Tung T. Pham, Tim D. Richards","doi":"10.1109/ISM.2015.22","DOIUrl":"https://doi.org/10.1109/ISM.2015.22","url":null,"abstract":"Lecture recording is not a new concept, nor is high-resolution recording of multimedia presentations that include computer and whiteboard material. We describe a novel portable lecture capture system that captures not only computer content and video, as most modern lecture capture systems do, but also content from whiteboards. The whiteboard material is captured at high resolution and processed for clarity without the need for the electronic whiteboards required by many capture systems. Our system also processes the entire lecture in real time. The system we present is the logical next step in lecture capture technology.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115448066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Feature Detection in HDR Based Imaging Under Changes in Illumination Conditions","authors":"A. Rana, G. Valenzise, F. Dufaux","doi":"10.1109/ISM.2015.58","DOIUrl":"https://doi.org/10.1109/ISM.2015.58","url":null,"abstract":"High dynamic range (HDR) imaging makes it possible to capture details in both dark and very bright regions of a scene, and is therefore expected to provide greater robustness to illumination changes than conventional low dynamic range (LDR) imaging in tasks such as visual feature extraction. However, it is not clear how large this gain is, nor which modalities of using HDR best obtain it. In this paper we evaluate the first block of the visual feature extraction pipeline, i.e., keypoint detection, using both LDR and different HDR-based modalities, when significant illumination changes are present in the scene. To this end, we captured a dataset with two scenes and a wide range of illumination conditions. On these images, we measure how the repeatability of either corner or blob interest points is affected by different LDR/HDR approaches. Our observations confirm the potential of HDR over conventional LDR acquisition. Moreover, extracting features directly from HDR pixel values is more effective than first tonemapping and then extracting features, provided that HDR luminance information is previously encoded to perceptually linear values.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114292006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}