{"title":"Evaluation of Crosstalk Metrics for 3D Display Technologies with Respect to Temporal Luminance Analysis","authors":"J. Bulat, L. Janowski, Dawid Juszka, M. Socha, M. Grega, Z. Papir","doi":"10.1109/ISM.2011.90","DOIUrl":"https://doi.org/10.1109/ISM.2011.90","url":null,"abstract":"Cross talk is one of the most important parameters of the 3D displays' quality. Different cross talk definitions exist, which makes cross talk measurement and comparison difficult. We take a step back and focus on a detailed 3D display luminance analysis. The conclusions we draw from the temporal luminance analysis can be used to propose an effective approach to cross talk measurements. In scope of the presented work we have measured four different 3D displays.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128852689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Caching for Location-Aware Image Queries in Mobile Networks","authors":"Bo Yang, M. Manohar","doi":"10.1109/ISM.2011.74","DOIUrl":"https://doi.org/10.1109/ISM.2011.74","url":null,"abstract":"Caching has been widely used in mobile networks to improve system performance. However, conventional caching methodologies have two major drawbacks in dealing with spatial queries in a dynamic mobile network: (i) the description of cached data is defined based on the query context instead of data content ignoring the spatial or semantic locality of the data. (ii) the description of cached data does not reflect the popularity of the data, making it inefficient in providing QoS-related services. To address these issues, we propose a location-aware caching (LAC) model which reflects the distribution of images based on the analysis of earlier queries. The novelty of our method stems from several factors including: 1) describing the image data distribution based on a Hilbert space-filling curve, 2) optimizing spatial query resolution through efficient exploitation of locally cached data, and 3) reducing the cost of query resolution with restricted search scope. Through extensive simulations, we show that our model can perform spatial search with less cost. In addition, it is scalable to large environments and voluminous data.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128059409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Physiological Approach to Determine Video Quality","authors":"S. Arndt, Jan-Niklas Antons, R. Schleicher, S. Möller, Simon Scholler, G. Curio","doi":"10.1109/ISM.2011.91","DOIUrl":"https://doi.org/10.1109/ISM.2011.91","url":null,"abstract":"Video quality has turned out to be a crucial aspect of multimodal transmission services. Most common video quality tests rely on a conscious judgment of test participants reflecting their internal quality perception. But it is not completely clear how this conscious rating is formed, neither in the auditory nor in the visual domain. Initial audio tests with Electroencephalography (EEG) have shown that EEG recordings can be used as a sensitive and non-intrusive method for quality assessment. In this paper we conducted first experiments of pure video quality tests with EEG to complement this approach in the visual domain. One of the goals for this experiment was to show that there is a different pattern in the EEG data for cases with no distortion compared to cases when there was a distortion and the subject recognized this in the subjective test.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133080631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ViPLab - A Virtual Programming Laboratory for Mathematics and Engineering","authors":"T. Richter, S. Rudlof, B. Adjibadji, Heiko Bernlöhr, Christoph Grüninger, C. Munz, A. Stock, Christian Rohde, R. Helmig","doi":"10.1109/ISM.2011.95","DOIUrl":"https://doi.org/10.1109/ISM.2011.95","url":null,"abstract":"In the process of the implementation of the eBologna program of the European states and the recent change of the German university system from the Diploma to the Bachelor/Master system, studies at German universities have been redesigned, courses have been condensed and learning content has been re-structured into granular \"modules\", each of which requires an evaluation at the end of the semester. Simultaneously, the skills required for working as an engineer changed as well, handling of computer software, knowledge of mathematical or numerical algorithms and programming skills play an increasingly important role in the daily job routine of the working engineer. To support the learning by practical exercises, engineering faculties, mathematics and physics, and the Computing Center of the University of Stuttgart setup a project for implementing an online programming lab for teaching the required skills. The focus of this project is to provide easy access to the necessary software tools, avoid the overhead of installation and maintenance, and seamlessly integrate these tools into the eLearning infrastructure of the university. This paper describes the motivation and backgrounds, the software infrastructure and early results of this project.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134231525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Qualitative Monitoring of Video Quality of Experience","authors":"P. Pérez, Jesús Gutiérrez, J. Ruiz, N. García","doi":"10.1109/ISM.2011.83","DOIUrl":"https://doi.org/10.1109/ISM.2011.83","url":null,"abstract":"Real-time monitoring of multimedia Quality of Experience is a critical task for the providers of multimedia delivery services: from television broadcasters to IP content delivery networks or IPTV. For such scenarios, meaningful metrics are required which can generate useful information to the service providers that overcome the limitations of pure Quality of Service monitoring probes. However, most of objective multimedia quality estimators, aimed at modeling the Mean Opinion Score, are difficult to apply to massive quality monitoring. Thus we propose a lightweight and scalable monitoring architecture called Qualitative Experience Monitoring (QuEM), based on detecting identifiable impairment events such as the ones reported by the customers of those services. We also carried out a subjective assessment test to validate the approach and calibrate the metrics. Preliminary results of this test set support our approach.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127850757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of Inconsistency Between Subject and Speaker Based on the Co-occurrence of Lip Motion and Voice Towards Speech Scene Extraction from News Videos","authors":"S. Kumagai, Keisuke Doman, Tomokazu Takahashi, Daisuke Deguchi, I. Ide, H. Murase","doi":"10.1109/ISM.2011.56","DOIUrl":"https://doi.org/10.1109/ISM.2011.56","url":null,"abstract":"We propose a method to detect the inconsistency between a subject and the speaker for extracting speech scenes from news videos. Speech scenes in news videos contain a wealth of multimedia information, and are valuable as archived material. In order to extract speech scenes from news videos, there is an approach that uses the position and size of a face region. However, it is difficult to extract them with only such approach, since news videos contain non-speech scenes where the speaker is not the subject, such as narrated scenes. To solve this problem, we propose a method to discriminate between speech scenes and narrated scenes based on the co-occurrence between a subject's lip motion and the speaker's voice. The proposed method uses lip shape and degree of lip opening as visual features representing a subject's lip motion, and uses voice volume and phoneme as audio feature representing a speaker's voice. Then, the proposed method discriminates between speech scenes and narrated scenes based on the correlations of these features. We report the results of experiments on videos captured in a laboratory condition and also on actual broadcast news videos. Their results showed the effectiveness of our method and the feasibility of our research goal.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124184761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Tempo-Structural Fisheye Slider for Navigation in Web Lectures","authors":"R. Mertens, Sebastian Pospiech, C. Wiesen, M. Ketterl","doi":"10.1109/ISM.2011.98","DOIUrl":"https://doi.org/10.1109/ISM.2011.98","url":null,"abstract":"Time-based slider interfaces haven proven to be an intuitive and effective means for navigation in web lectures and other video content. They allow users to easily navigate to arbitrary positions in the video and clearly visualize the video's linear structure. They do, however, lack any contextual information about the video's content at these positions. Earlier approaches have tackled this shortcoming by visualizing slide titles or slide previews on the timeline. The approach presented in this paper goes one step further in that it links the current slider position to a fisheye-style view of the lecture slide overview. The fisheye shows the slide playing at the current position in full detail and even in the corresponding animation step -- if animated. It also shows the neighboring slides' previews and visually maps them to the timeline, thus providing contextual information on the current slider position while still maintaining a global overview.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124884332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video text extraction based on image regularization and temporal analysis","authors":"Ângelo Magno de Jesus, S. Guimarães, Zenilton K. G. Patrocínio","doi":"10.1109/ISM.2011.55","DOIUrl":"https://doi.org/10.1109/ISM.2011.55","url":null,"abstract":"Video text extraction is the process of identifying embedded text on video, which is usually on complex background. This paper proposes a new approach to cope with this problem considering image regularization and temporal information. The former helps us to decrease the number of gray values in order to simplify the image content, and the second one takes advantage of video text persistence in order to identify video segments ignoring text changes. According to our experiments, the proposed method presents better results than other. Moreover, we propose a post-processing step for improving the text results obtained by Otsu method.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128790468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing the Discrete Morphlet Transform and its Applications for Voice Conversion","authors":"L. S. Vieira, R. Guido, Shi-Huang Chen","doi":"10.1109/ISM.2011.93","DOIUrl":"https://doi.org/10.1109/ISM.2011.93","url":null,"abstract":"This paper introduces Morph let, a new wavelet transform adapted for voice conversion purposes. The paradigm of joint time-frequency-shape analysis of discrete-time signals, possible by means of the Discrete Shape let Transform (DST), is the basis used for the construction of Morph lets. The results assure the efficacy of the proposed transform, which is able, by itself and with the help of no other tool such as a neural network, to carry out the task, totally.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129844301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User Behavior in the \"teleda\" Social TV Service - Achieving a Public Forum on the Network","authors":"Masaru Miyazaki, N. Hamaguchi, H. Fujisawa","doi":"10.1109/ISM.2011.63","DOIUrl":"https://doi.org/10.1109/ISM.2011.63","url":null,"abstract":"The spread of Video on Demand (VOD) services is driving the creation of video viewing environments in which users can watch whatever they like, whenever they like. The recent appearance of social network services (SNSs), moreover, is bringing big changes to the world of media by enabling anyone to become a disseminator of information. At NHK, Japan 's public broadcaster, we are studying a platform that combines VOD and SNS to create ghorizontal linksh between program viewers and facilitate encounters with new programs. To investigate the requirements for this platform, we built a SNS site called gteledah that enables program viewing by VOD and conducted a large-scale, three-month verification trial with about 1000 participants. We report on the features of viewing behavior obtained from the results of this trial and discuss the potential of public broadcasting services that combine VOD and SNS.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125850494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}