Efficient automatic detection of 3D video artifacts
Mohan Liu, Ioannis Mademlis, P. Ndjiki-Nya, Jean-Charles Le Quintrec, N. Nikolaidis, I. Pitas
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958787
Abstract: This paper summarizes common artifacts in stereo video content. These artifacts lead to a poor, or even uncomfortable, 3D viewing experience. Efficient approaches for detecting three typical artifacts (sharpness mismatch, synchronization mismatch, and stereoscopic window violation) are presented in detail. Sharpness mismatch is estimated by measuring the width deviations of edge pairs in depth planes. Synchronization mismatch is detected based on motion inconsistencies of feature points between the stereoscopic channels within a short time frame. Stereoscopic window violation is detected using connected component analysis when objects hit the vertical frame boundaries while lying in front of the virtual screen. For the experiments, test sequences were created in a professional studio environment and state-of-the-art metrics were used to evaluate the proposed approaches. The experimental results show that the proposed algorithms are considerably robust in detecting 3D defects.
Survey of web-based crowdsourcing frameworks for subjective quality assessment
T. Hossfeld, Matthias Hirth, Pavel Korshunov, Philippe Hanhart, B. Gardlo, Christian Keimel, C. Timmerer
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958831
Abstract: The popularity of crowdsourcing for performing various tasks online has increased significantly in the past few years. The low cost and flexibility of crowdsourcing have, in particular, attracted researchers in the field of subjective multimedia evaluation and Quality of Experience (QoE). Since online assessment of multimedia content is challenging, several dedicated frameworks have been created to aid in designing the tests, including support for testing methodologies such as ACR, DCR, and PC, setting up the tasks, training sessions, screening of subjects, and storage of the resulting data. In this paper, we focus on web-based frameworks for multimedia quality assessment that support commonly used crowdsourcing platforms such as Amazon Mechanical Turk and Microworkers. We provide a detailed overview of the crowdsourcing frameworks and evaluate them to aid researchers in the field of QoE assessment in selecting frameworks and crowdsourcing platforms adequate for their experiments.
{"title":"Background subtraction under sudden illumination change","authors":"Hasan Sajid, S. Cheung","doi":"10.1109/MMSP.2014.6958814","DOIUrl":"https://doi.org/10.1109/MMSP.2014.6958814","url":null,"abstract":"In this paper, we propose a Multiple Background Model based Background Subtraction (MB2S) algorithm that is robust against sudden illumination changes in indoor environment. It uses multiple background models of expected illumination changes followed by both pixel and frame based background subtraction on both RGB and YCbCr color spaces. The masks generated after processing these input images are then combined in a framework to classify background and foreground pixels. Evaluation of proposed approach on publicly available test sequences show higher precision and recall than other state-of-the-art algorithms.","PeriodicalId":164858,"journal":{"name":"2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129095055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph-based depth video denoising and event detection for sleep monitoring
Cheng Yang, Yu Mao, Gene Cheung, V. Stanković, Kevin Chan
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958802
Abstract: Quality of sleep greatly affects a person's physiological well-being. Traditional sleep monitoring systems are expensive and intrusive enough to disturb the natural sleep of clinical patients. In our previous work, we proposed a non-intrusive sleep monitoring system that first records depth video in real time and then analyzes the recorded depth data offline to track a patient's chest and abdomen movements over time. Detected abnormal breathing is then interpreted as episodes of apnoea or hypopnoea. Leveraging recent advances in graph signal processing (GSP), in this paper we propose two additions that further improve our sleep monitoring system. First, temporal denoising is performed using a block motion vector smoothness prior expressed in the graph-signal domain, so that unwanted temporal flickering can be removed. Second, a graph-based event classification scheme is proposed, so that apnoea/hypopnoea detection can be performed accurately and robustly. Experimental results show, first, that the graph-based temporal denoising scheme outperforms an implementation of a temporal median filter in terms of flicker removal, and second, that our graph-based event classification scheme is noticeably more robust to errors in the training data than two conventional support vector machine (SVM) implementations.
{"title":"Optimal detector for camera model identification based on an accurate model of DCT coefficients","authors":"T. H. Thai, R. Cogranne, F. Retraint","doi":"10.1109/MMSP.2014.6958810","DOIUrl":"https://doi.org/10.1109/MMSP.2014.6958810","url":null,"abstract":"The goal of this paper is to design a statistical test for the camera model identification problem. The approach is based on the state-of-the-art model of Discret Cosine Transform (DCT) coefficients to capture their statistical difference, which jointly results from different sensor noises and in-camera processing algorithms. The noise model parameters are considered as camera fingerprint to identify camera models. The camera model identification problem is cast in the framework of hypothesis testing theory. In an ideal context where all model parameters are perfectly known, this paper studies the optimal detector given by the Likelihood Ratio Test (LRT) and analytically establishes its statistical performances. In practice, a Generalized LRT is designed to deal with the difficulty of unknown parameters such that it can meet a prescribed false alarm probability while ensuring a high detection performance. Numerical results on simulated database and natural JPEG images highlight the relevance of the proposed approach.","PeriodicalId":164858,"journal":{"name":"2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115788194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Music recommendation based on artist novelty and similarity
Ning Lin, Ping-Chia Tsai, Yu-An Chen, Homer H. Chen
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958801
Abstract: Most existing systems recommend songs to the user based on the popularity of songs and singers. The system proposed in this paper, however, is driven by an emerging and somewhat different need in the music industry: promoting new talent. The system recommends songs based on the novelty of singers (or artists) and their similarity to the user's favorite artists. Novel artists whose popularity is on the rise are given a higher recommendation priority. Specifically, given a user's favorite artists, the system first determines candidate artists based on their similarity to the favorite artists and then selects those with a higher novelty score than the favorite artists. The system then outputs a playlist composed of the most popular songs of the selected artists. The proposed system can be integrated into most existing systems. Its performance is evaluated using the Spotify Radio Recommender as a reference and a pool of 100 subjects recruited on campus. Experimental results show that our system achieves a high novelty score and a competitive user-preference score.
{"title":"Adaptive low complexity colour transform for video coding","authors":"R. Weerakkody, M. Mrak","doi":"10.1109/MMSP.2014.6958820","DOIUrl":"https://doi.org/10.1109/MMSP.2014.6958820","url":null,"abstract":"For video compression, the RGB signals are usually converted at the source to a perceptual colour space, followed by chroma sub-sampling, for coding efficiency. This is based on the typically higher human visual system sensitivity to the luminance than chrominance of image signals. However, there are specific applications that demand carrying the full RGB signals through the transmission chain, which may also benefit from lossless colour transforms, for efficient coding. In either case, the best colour transform function is noted to be content dependent, although fixed transforms are typically adopted for convenience. This paper presents a method of dynamically adapting this colour transform function for each picture block, using a class of low complexity lifting based schemes. The performance of the proposed algorithm is compared with a number of fixed colour transform schemes and shows a significant compression gain over native RGB coding and YCoCg transform.","PeriodicalId":164858,"journal":{"name":"2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126921907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance evaluation of the emerging JPEG XT image compression standard
A. Pinheiro, K. Fliegel, Pavel Korshunov, Lukáš Krasula, Marco V. Bernardo, Maria Pereira, T. Ebrahimi
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958834
Abstract: The upcoming JPEG XT standard is under development for High Dynamic Range (HDR) image compression. It encodes a Low Dynamic Range (LDR) version of the HDR image, generated by a Tone-Mapping Operator (TMO), with conventional JPEG coding as a base layer, and encodes the extra HDR information in a residual layer. This paper studies the performance of the three profiles of JPEG XT (referred to as profiles A, B, and C) using a test set of six HDR images. Four TMO techniques were used for base-layer image generation to assess the influence of the TMO on the performance of the JPEG XT profiles. The HDR images were then coded with different quality levels for the base layer and for the residual layer. The performance of each profile was evaluated using the Signal-to-Noise Ratio (SNR), Feature SIMilarity index (FSIM), Root Mean Square Error (RMSE), and CIEDE2000 color difference objective metrics. The evaluation results demonstrate that profiles A and B lead to similar saturation of quality at higher bit rates, while profile C exhibits no saturation. Profiles B and C appear to be more dependent than profile A on the TMO used for the base layer.
{"title":"Automatic high dynamic range hallucination in inverse tone mapping","authors":"Pin-Hung Kuo, Huai-Jen Liang, Chi-Sun Tang, Shao-Yi Chien","doi":"10.1109/MMSP.2014.6958828","DOIUrl":"https://doi.org/10.1109/MMSP.2014.6958828","url":null,"abstract":"Nowadays the dynamic range of displays has been higher and higher, which means that contents can be recorded and displayed with more detail. However, the original low dynamic range contents were recorded in a lower dynamic range. Such contents will be unsatisfying compared to high dynamic range contents, especially in the saturated, or overexposed region. This paper proposes an algorithm to compensate such exposed regions, which is called automatic high dynamic range image hallucination for inverse tone mapping. Inverse tone-mapping is the process of creating a high dynamic range image from a single low dynamic range image. In this work, high dynamic range image hallucination is used as the key method to reproduce the information which is lost in the low dynamic range image capturing. Previous methods require user interaction as a hallucination criteria, and is not practical in some applications where user interaction is not available. In this paper, the hallucination is performed automatically with the assistance of luminance and texture decoupling process. This scheme produces visually satisfying results and has the potential to be applied to video inverse tone-mapping with its automatic property.","PeriodicalId":164858,"journal":{"name":"2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)","volume":"190 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116106714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
View-invariant feature discovering for multi-camera human action recognition
Hong Lin, L. Chaisorn, Yongkang Wong, Anan Liu, Yuting Su, M. Kankanhalli
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 20 November 2014. DOI: https://doi.org/10.1109/MMSP.2014.6958807
Abstract: Intelligent video surveillance systems are built to automatically detect events of interest, especially through object tracking and behavior understanding. In this paper, we focus on human action recognition in a surveillance environment, specifically in a multi-camera monitoring scene. Although many approaches have achieved success in recognizing human actions from video sequences, they are designed for a single view and are generally not robust to viewpoint changes. Human action recognition across different views remains challenging due to the large variations from one view to another. We present a framework to solve the problem of transferring action models learned in one view (the source view) to another view (the target view). First, local space-time interest point features and global shape-flow features are extracted as low-level features, followed by building a hybrid Bag-of-Words model for each action sequence. The data distributions of relevant actions from the source and target views are linked via a cross-view discriminative dictionary learning method. Through the view-adaptive dictionary pair learned by this method, data from the source and target views can be mapped into a common, view-invariant space. Furthermore, we extend the framework to transfer action models from multiple source views to one target view when several source views are available. Experiments on the IXMAS human action dataset, which contains videos captured from five viewpoints, show the efficacy of our framework.