{"title":"Contour-Based Depth Coding: A Subjective Quality Assessment Study","authors":"Marco Calemme, Marco Cagnazzo, B. Pesquet-Popescu","doi":"10.1109/ISM.2015.34","DOIUrl":"https://doi.org/10.1109/ISM.2015.34","url":null,"abstract":"Multi-view video plus depth is emerging as the most flexible format for 3D video representation, as witnessed by the current standardization efforts by ISO and ITU. The depth information allows synthesizing virtual viewpoints, and various techniques have been proposed for its compression. It is generally recognized that high-quality view rendering at the receiver side is possible only by preserving the contour information, since distortions on edges during the encoding step would cause a noticeable degradation of the synthesized view and of the 3D perception. As a consequence, recent approaches include contour-based coding of depth maps. However, the impact of contour-preserving depth coding on the perceived quality of synthesized images has not been adequately studied. Therefore, in this paper we conduct a subjective study to better understand the limits and the potential of the different techniques. Our results show that the contour information is indeed relevant in the synthesis step: preserving the contours and coding the rest coarsely typically leads to images that users cannot tell apart from the reference ones, even at low bit rates. Moreover, our results show that objective metrics commonly used to evaluate synthesized images may have a low correlation with MOS ratings and are in general not consistent across techniques and contents.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129168199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-modality Mobile Image Recognition Based on Thermal and Visual Cameras","authors":"Jui-Hsin Lai, Chung-Ching Lin, Chun-Fu Chen, Ching-Yung Lin","doi":"10.1109/ISM.2015.120","DOIUrl":"https://doi.org/10.1109/ISM.2015.120","url":null,"abstract":"The advances of mobile computing and sensor technology have turned mobile devices into powerful instruments. The integration of thermal and visual cameras extends the capability of computer vision, since the two modalities reveal different image characteristics; however, aligning the image pairs is a challenge. This paper proposes an effective approach to align image pairs for event detection on mobile devices through image recognition, leveraging the thermal and visual cameras as multi-modality sources. By analyzing heat patterns, the proposed app can identify heating sources and help users inspect their house heating system; furthermore, by applying image recognition, it can help field workers assess asset conditions and provide guidance to resolve their issues.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125587984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-Grained Scalable Video Caching","authors":"Qiushi Gong, J. Woods, K. Kar, Jacob Chakareski","doi":"10.1109/ISM.2015.81","DOIUrl":"https://doi.org/10.1109/ISM.2015.81","url":null,"abstract":"Caching has been shown to enhance network performance. In this paper, we study fine-grained scalable video caching. We start from a single-cache scenario, providing a solution to the cache allocation problem that optimizes the average expected video quality for the most popular video clips. Actual trace data is used to verify the performance of our algorithm and to compare its backhaul link bandwidth consumption with that of non-scalable video caching. In addition, we extend our analysis to collaborative caching and integrate network coding for further transmission efficiency. Our experimental results demonstrate considerable performance enhancement.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129392923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Human Monitoring with Wireless Fiber-Optic Multimedia Sensor Networks","authors":"Qingquan Sun","doi":"10.1109/ISM.2015.123","DOIUrl":"https://doi.org/10.1109/ISM.2015.123","url":null,"abstract":"This paper presents a binary compressive sensing based fiber-optic sensor system for human monitoring. Fiber-optic sensors are flexible and convenient for measuring pressure information from human subjects, which enables them to perform localization and tracking directly. To capture more information about human subjects and scenes, a Bernoulli mixture model is proposed for scene modeling. Meanwhile, compressive sensing based space encoding and decoding techniques are developed to implement scene recognition. Experimental results demonstrate that the proposed fiber-optic sensing system and compressive sensing based encoding/decoding techniques are effective for human monitoring in terms of tracking and scene recognition.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115831976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TRACE: Linguistic-Based Approach for Automatic Lecture Video Segmentation Leveraging Wikipedia Texts","authors":"R. Shah, Yi Yu, A. Shaikh, Roger Zimmermann","doi":"10.1109/ISM.2015.18","DOIUrl":"https://doi.org/10.1109/ISM.2015.18","url":null,"abstract":"In multimedia-based e-learning systems, the accessibility and searchability of most lecture video content is still insufficient due to the unscripted and spontaneous speech of the speakers. Moreover, the problem becomes even more challenging when the quality of such lecture videos is not sufficiently high. To extract the structural knowledge of a multi-topic lecture video and thus make it easily accessible, it is very desirable to divide each video into shorter clips by performing an automatic topic-wise video segmentation. To this end, this paper presents the TRACE system, which automatically performs such a segmentation based on a linguistic approach using Wikipedia texts. TRACE makes two main contributions: (i) the extraction of a novel linguistic-based Wikipedia feature to segment lecture videos efficiently, and (ii) an investigation of the late fusion of video segmentation results derived from state-of-the-art algorithms. Specifically, for the late fusion we combine the confidence scores produced by models constructed from visual, transcriptional, and Wikipedia features. According to our experiments on lecture videos from VideoLectures.NET and NPTEL, the proposed algorithm segments knowledge structures more accurately than existing state-of-the-art algorithms. The evaluation results are very encouraging and thus confirm the effectiveness of TRACE.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127773560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
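The late fusion described in the TRACE abstract above can be sketched as a weighted combination of per-modality confidence scores for a candidate segment boundary. This is a minimal illustration, not the paper's actual fusion rule: the function name, the modality order, and the weights are all assumptions.

```python
def late_fusion(confidences, weights):
    """Weighted late fusion of per-modality confidence scores for one
    candidate segment boundary (here: visual, transcript, Wikipedia).
    The weights are illustrative, not the values used by TRACE."""
    total = sum(weights)
    return sum(c * w for c, w in zip(confidences, weights)) / total

# A boundary scored highly by the Wikipedia feature but weakly by the others.
fused = late_fusion([0.2, 0.4, 0.9], [1.0, 1.0, 2.0])  # 0.6
```

Each model votes independently on the same candidate, and the fused score is what the combined segmenter thresholds.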
{"title":"WAMINet: An Open Source Library for Dynamic Geospace Analysis Using WAMI","authors":"M. Maurice, Matt Piekenbrock, Derek Doran","doi":"10.1109/ISM.2015.66","DOIUrl":"https://doi.org/10.1109/ISM.2015.66","url":null,"abstract":"Modern military and commercial aerial platforms have the ability to capture imagery data over very large (kilometer wide) areas at moderate rates of 1-3 frames per second. This wide-area motion imagery (WAMI) captures the conditions and activity over a geospace, hence offering an opportunity to understand its wide-scale dynamics. This paper presents WAMINet, a library capable of ingesting large numbers of WAMI frames to build a network representation of the dynamics of the geospace being studied. It discusses the approach WAMINet uses to build the network representation, the component based design of the architecture, and illustrates its WAMI processing capabilities. Prototype versions of WAMINet and its code are available for download.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132280259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precise Skin-Tone and Under-Tone Estimation by Large Photo Set Information Fusion","authors":"P. Aarabi, Benzakhar Manashirov, Edmund Phung, Kyung Moon Lee","doi":"10.1109/ISM.2015.61","DOIUrl":"https://doi.org/10.1109/ISM.2015.61","url":null,"abstract":"This paper proposes a novel method for estimating a person's skin-tone and under-tone by analyzing a large collection of photos of that person. By excluding badly lit images and analyzing well-lit skin pixels, it becomes possible to compute an overall skin-tone estimate that is in line with the person's true skin shade and, based on this, to determine the person's under-tone. In a study involving 15,590 user sessions and 104,366 photos, it was found that the proposed methodology can estimate the normalized RGB of the person's skin-tone with 2.3% RMSE or, based on the CIE76 color difference measure, with an average Delta E of 3.15 in L*a*b* color space.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
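The CIE76 color difference cited in the abstract above is simply the Euclidean distance between two colors in L*a*b* space; a Delta E of 3.15 therefore means the estimated and true skin-tones are, on average, just above the commonly cited just-noticeable-difference threshold of about 2.3. A minimal sketch of the metric (the function name is ours):

```python
import math

def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two
    (L*, a*, b*) triples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

# Identical colors have zero difference.
d = delta_e_cie76((50.0, 10.0, 10.0), (50.0, 10.0, 10.0))  # 0.0
```

Later metrics (CIE94, CIEDE2000) weight the lightness and chroma axes differently, but CIE76 is the one the study reports.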
{"title":"Quantitative Evaluation of Hair Texture","authors":"W. Guo, P. Aarabi","doi":"10.1109/ISM.2015.43","DOIUrl":"https://doi.org/10.1109/ISM.2015.43","url":null,"abstract":"In this paper, we quantitatively evaluate the role of texture in hair patches, with a primary motivation of understanding what can be learned and applied by machine learning systems for texture-based hair detection. We evaluate the distribution of gradient directions in hair patches, and explore the relation between proximity to the face and the angle of the gradients for 2,870,000 hair patches selected from 100 manually silhouetted hairstyles.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130144691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a Web-Based Haptic Authoring Tool for Multimedia Applications","authors":"Haiwei Dong, Yu Gao, Hussein Al Osman, Abdulmotaleb El Saddik","doi":"10.1109/ISM.2015.71","DOIUrl":"https://doi.org/10.1109/ISM.2015.71","url":null,"abstract":"In this paper, we introduce an MPEG-V based haptic authoring tool intended to simplify the development of haptics-enabled multimedia applications. The tool provides a web-based interface for users to create haptic environments by importing 3D models and adding haptic properties to them. The user can then export the resulting environment to a standard MPEG-V format, which can be imported into a haptic player that renders the described haptics-enabled 3D scene. The proposed tool supports many haptic devices, including Geomagic, Force Dimension, Novint Falcon, and Moog FCS HapticMaster devices. We conduct a proof-of-concept HTML5 haptic game project and user studies on haptic effects, the development process, and the user interface, which show our tool's effectiveness in simplifying the development of haptics-enabled multimedia applications.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131586448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Normalized Gaussian Distance Graph Cuts for Image Segmentation","authors":"Chengcai Leng, W. Xu, I. Cheng, Z. Xiong, A. Basu","doi":"10.1109/ISM.2015.36","DOIUrl":"https://doi.org/10.1109/ISM.2015.36","url":null,"abstract":"This paper presents a novel, fast image segmentation method based on normalized Gaussian distance on nodes in conjunction with normalized graph cuts. We review the equivalence between kernel k-means and normalized cuts. Then we extend the framework of efficient spectral clustering and avoid choosing weights in the weighted graph cuts approach. Experiments on synthetic data sets and real-world images demonstrate that the proposed method is effective and accurate.","PeriodicalId":250353,"journal":{"name":"2015 IEEE International Symposium on Multimedia (ISM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129799075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
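For context on the last abstract above, the standard normalized-cut formulation it builds on (Shi and Malik) uses a Gaussian affinity between nodes and minimizes the normalized cut of a bipartition; the abstract does not specify the authors' exact normalized Gaussian distance, so these are the textbook definitions only:

```latex
w_{ij} = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right),
\qquad
\mathrm{Ncut}(A,B) =
  \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} +
  \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}
```

The kernel k-means equivalence the paper reviews states that minimizing this objective is equivalent to a weighted kernel k-means problem with a suitably chosen kernel matrix, which is what lets the authors avoid explicit eigendecomposition-style weight choices.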