{"title":"Coding efficiency comparison of new video coding standards: HEVC vs VP9 vs AVS2 video","authors":"Il-Koo Kim, Sunil Lee, Yinji Piao, Jing Chen","doi":"10.1109/ICMEW.2014.6890700","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890700","url":null,"abstract":"In this paper, coding efficiency comparisons of emerging video codecs, HEVC, VP9 and AVS2 Video, are conducted. The purpose of this paper is to provide useful information about the current state-of-the-art video codecs and to help both academia and industry to get insight for developing advanced techniques on top of them. At first, design differences among the three video codecs are given briefly and then coding efficiency comparisons are conducted in random access and low delay condition. According to experimental results, HEVC outperforms VP9 and AV2 video by 24.9% and 6.5% in random access condition, respectively. In low delay condition, HEVC also outperforms VP9 and AVS2 video by 8.7% and 14.5%, respectively.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129098779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Playout buffer and DRX aware scheduling scheme for video streaming over 3GPP LTE system","authors":"Yuchen Chen, Guizhong Liu","doi":"10.1049/iet-com.2015.0134","DOIUrl":"https://doi.org/10.1049/iet-com.2015.0134","url":null,"abstract":"In this paper, a playout buffer and DRX aware scheduling scheme(PBDAS) is proposed to improve the user experience of the video streaming over the DRX enabled LTE systems. By considering the UE playout buffer level and the DRX cycle length, we define the remaining playout time, which can effectively indicate the UEs' urgency. Based on their urgency, the UEs are assigned with different forms of scheduling metrics, with which the resource allocation is accomplished. Specifically, PBDAS would allocate the radio resource to the UEs whose playout process are more likely to be interruptted. The scheme has advantages in two aspects. On one hand, the annoying playout interruptions could be effectively avoided or shortened. On the other hand, the power saving performance could be improved without deteriorating the playout continuity. The simulation results demonstrate that our algorithm can not only improve the playout continuity but also decrease the power consumption.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130132655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A class-specified learning based super resolution for low-bit-rate compressed images","authors":"Han Zhao, Xiaoguang Li, L. Zhuo","doi":"10.1109/ICMEW.2014.6890649","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890649","url":null,"abstract":"Due to limitations on the image capturing devices, distance, storage capability and bandwidth for transmission, many images in multimedia applications are low-bit-rate compressed and low resolution. In this paper, we proposed a class-specified learning based super resolution for this kind of low quality images. Firstly, we proposed a class-specified filter to remove the compressed distortions. Then a class-specified learning based scheme is employed to super resolve images with different compression rates. Experimental results show that the proposed method can improve both the objective and subjective quality of the images effectively.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117264931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How does the shape descriptor measure the perceptual quality of the retargeting image?","authors":"Lin Ma, Long Xu, H. Zeng, K. Ngan, Chenwei Deng","doi":"10.1109/ICMEW.2014.6890548","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890548","url":null,"abstract":"Perceptual quality evaluation of the retargeting image plays an important role in benchmarking different retargeting methods, as well as guiding or optimizing the retargeting process. The distortions introduced during the retargeting process are mainly categorized into shape distortion and content information loss [1]. The shape distortion measurement is critical to the evaluation of retargeting image perceptual quality. In this paper, the performances of different shape descriptors, such as PHOW [2], GIST [3], MPEG-7 descriptors [4], EMD [5], for evaluating the perceptual quality of the retargeting image are examined based on the public image retargeting subjective quality database [6]. Experimental results demonstrated that most of the shape descriptors can hardly capture the characteristics representing the quality of the retargeting image, but the global shape descriptor GIST [3] presents significant performance gains. Moreover, by incorporating with the measurements from the perspective of content information loss, a better performance is further obtained.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"135 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117330444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel remote display method for multimedia cloud","authors":"Biao Song, Yuan Tian, M. Hassan, Atif Alamri, M. S. Hossain, Abdulhameed Alelaiwi, Bingyin Zhou","doi":"10.1109/ICMEW.2014.6890659","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890659","url":null,"abstract":"Cloud computing offers sufficient computing and storage resources that can be used to provide multimedia services. Migrating the existing multimedia service to cloud brings a new challenging issue, i.e., remote display of video contents. To reduce the bandwidth consumption especially for mobile users, it is desired to encode video before sending to client. Existing encoding methods have unique advantages and disadvantages, differing their performance under varying situations. Thus, we propose to use multi-encoder method to solve the remote display problem for remote multimedia cloud. To select the most appropriate encoder, factors including cost, application requirement, network, client device and codec implementation are considered. In this paper, we form a non-linear programming model, and take M-JPEG encoder as an example to illustrate how to apply the proposed model for getting desired optimization.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121142935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A semi-supervised temporal clustering method for facial emotion analysis","authors":"Rodrigo Araujo, M. Kamel","doi":"10.1109/ICMEW.2014.6890712","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890712","url":null,"abstract":"In this paper, we propose a semi-supervised temporal clustering method and apply it to the complex problem of facial emotion categorization. The proposed method, which uses a mechanism to add side information based on the semi-supervised kernel k-means framework, is an extension of the temporal clustering algorithm Aligned Cluster Analysis (ACA). We show that simply adding a small amount of soft constraints, in the form of must-link and cannot-link, improves the overall accuracy of the state-of-the-art method, ACA without adding any extra computational complexity. The results on the non-posed database VAM corpus for three different emotion primitives (valence, dominance, and activation) show improvements compared to the original approach.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114270720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QoS aware platform for dependable sensory environments","authors":"Aitor Agirre, Jorge Parra, Elisabet Estévez-Estévez, M. Marcos","doi":"10.1109/ICMEW.2014.6890661","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890661","url":null,"abstract":"Sensory environments for healthcare are commonplace nowadays. A patient monitoring system in such an environment deals with sensor data capture, transmission and processing in order to provide on-the-spot support for monitoring the vulnerable and critical patient. A fault in such a system can be hazardous on the health of the patient. Therefore, such a system must be dependable and ensure reliability, fault-tolerance, safety and other critical aspects, in order to deploy it in real scenario. This paper encounters some of these issues and proposes a component platform to develop a flexible framework with specific support for fault tolerance and safe inter-component communication related QoS aspects. The platform adopts the Service Component Architecture (SCA) model and defines a Data Distribution Service (DDS) binding, which provides the fault tolerance and the required safety-ensuring techniques and measures, as defined in the IEC 61784-3-3 standard.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131110138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining knowledge from clicks: MSR-Bing image retrieval challenge","authors":"Xiansheng Hua, Ming Ye, Jin Li","doi":"10.1109/ICMEW.2014.6890598","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890598","url":null,"abstract":"This paper introduces the MSR-Bing grand challenge on image retrieval. The challenge is based on a dataset generated from click logs of a real image search engine. The challenge is to mine semantic knowledge from the dataset and predict the relevance score of any image-query pair. A brief introduction to the dataset, the challenge task, and the evaluation method will be presented. And then the methods proposed by the challenge participants are introduced, followed by evaluation results and some discussions about the goal and future of the challenge.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131409326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super-resolution de-fencing: Simultaneous fence removal and high-resolution image recovery using videos","authors":"Chetan S. Negi, Koyel Mandal, R. R. Sahay, M. Kankanhalli","doi":"10.1109/ICMEW.2014.6890641","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890641","url":null,"abstract":"In real-world scenarios, images or videos taken at public places using inexpensive low-resolution cameras, such as smartphones are also often degraded by the presence of occlusions such as fences/barricades. Finer details in images captured using such low-end equipment are lost due to blurring and under-sampling. Compounding this problem is missing data due to the presence of an intervening occlusion between the scene and the camera such as a fence. To recover a fence-free high-resolution image, we use videos of the scene captured by panning a hand-held camera and model the effects of various degradations. Initially, we obtain the spatial locations of the fence/occlusions and the global shifts of the degraded background image. The underlying high-resolution fence-free image is modeled as a discontinuity-adaptive Markov random field and its maximum a-posteriori estimate is obtained using an optimization approach. The advantage of using this prior is that high-frequency information is preserved during the reconstruction of the super-resolved image. Specifically, we use the fast graduated non-convexity algorithm to minimize a non-convex energy function. 
Experiments with both synthetic and real-world data demonstrate the efficacy of the proposed algorithm.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128155119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A visually salient approach to recognize vehicles based on hierarchical architecture","authors":"Qiaochu Liu, Ruoying Jia, Zheng Shou, Xiaoran Zhan, Birong Zhang","doi":"10.1109/ICMEW.2014.6890582","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890582","url":null,"abstract":"In order to recognize multi-class vehicles, traditional methods are typically based on license plates and frontal images of vehicles. These methods rely heavily on specific datasets and thus are not applicable in real-world tasks. In this paper, we propose a novel method based on a hierarchical model, HMAX, which simulates visual architecture of primates for object recognition. It can extract features of shift-invariance and scale-invariance by Gabor filtering, template matching, and max pooling. In particular, we adopt a model of saliency-based visual attention to detect salient patches for template matching, also we drop inefficient features via an all-pairs linear SVM. During experiments, high accuracy and great efficiency are achieved on a dataset which has 31 types and over 1400 vehicle images with varying scales, orientations, and colors. With comparisons with Original-HMAX, Salient-HMAX, and Sifted-HMAX model, our method achieves classifying accuracy at 92% and time for each image at around 1.5s, while reduces 73% of the time consumed by original HMAX model.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123850165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}