{"title":"Examining Memory in Reconstruction Distortion: Dropping Additional Packets to Improve Video Quality","authors":"Jacob Chakareski, J. Apostolopoulos","doi":"10.1109/MMSP.2005.248601","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248601","url":null,"abstract":"The source coding process and the packet loss process create certain dependencies between encoded video units in terms of the reconstruction distortion of the video signal at the receiver in case of transmission over packet erasure channels. In this paper, we examine the importance of this \"distortion memory\" via a specific class of memory-based models denoted Distortion Chains that are used for predicting the distortion of the reconstructed video signal in case of missing multiple packets at the receiver. We show that taking into account even the smallest amount of memory that is possible can yield substantial gains in terms of prediction accuracy and packet selection (packet dropping) performance. An additional and rather surprising result of our study is the fact that in certain situations dropping an additional video packet (which could otherwise be delivered) can actually improve the quality of the reconstructed video","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130795062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Highlighting and Tracking in a Novel VideoGIS System for Telematics","authors":"Chih-Wei Huang, JaeJun Yoo, Sung-Hwan Jung, Kyoung-Ho Choi, Jenq-Neng Hwang","doi":"10.1109/MMSP.2005.248587","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248587","url":null,"abstract":"A VideoGIS system combining geo-referenced video information with conventional geographic information (GI) is developed to provide a more comprehensive understanding over a spatial area. In our on-going project, the hypermedia can be transmitted to GPS-guided vehicles in a scalable (layered) fashion while providing highlighting and tracking of landmark objects on video upon drivers' request. Special efforts on GPS error calibration and target objects tracking are reported in this paper","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129884508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Channel-Time Allocation for the Transmission of Multiple Video Streams Over a Shared Channel","authors":"M. Kalman, B. Girod","doi":"10.1109/MMSP.2005.248588","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248588","url":null,"abstract":"When multiple video streams are delivered from a single server to multiple clients over a time-shared channel that may support differing transmission speeds depending on the client, the problem arises of how to optimally divide channel time among the streams so that the overall quality of the decoded videos is maximized. This paper presents solutions to this allocation problem for three optimization objectives: minimizing the MSE of the decoded video over all the streams, maximizing the PSNR over all the streams, and minimizing the maximum MSE (equivalent to maximizing the minimum PSNR). Each is shown to be a convex problem which can be solved quickly using standard techniques. This paper then investigates the question of which objective should be used. Subjective tests were conducted to determine how well allocations made according to the various objectives correspond to the preferences of human test subjects. Results indicate, surprisingly, that the minimum MSE objective corresponds best to subjective preferences","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114465782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Mosaicking Through Macro-Feature Affine Matching","authors":"L. Lucchese, Simone Leorin, G. Cortelazzo","doi":"10.1109/MMSP.2005.248650","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248650","url":null,"abstract":"This paper presents a new method for registering images related by 2-D affine transformations. The method is based on extracting macro-features from the images to register and matching the polar curves associated with their energies, defined as the squared Fourier transform magnitudes. Such matching is formulated as a simple minimization problem whose optimal solution is found with the Levenberg-Marquardt algorithm. The excellent performance of the algorithm is shown through a practical example of image mosaicking","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125597774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Key Posture Selection for Human Behavior Analysis","authors":"Duan-Yu Chen, H. Liao, Hsiao-Rong Tyan, Chia-Wen Lin","doi":"10.1109/MMSP.2005.248572","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248572","url":null,"abstract":"A novel human posture analysis framework that can perform automatic key posture selection and template matching for human behavior analysis is proposed. The entropy measurement, which is commonly adopted as an important feature to describe the degree of disorder in thermodynamics, is used as an underlying feature for identifying key postures. First, we use cumulative entropy change as an indicator to select an appropriate set of key postures from a human behavior video sequence and then conduct a cross entropy check to remove redundant key postures. With the key postures detected and stored as human posture templates, the degree of similarity between a query posture and a database template is evaluated using a modified Hausdorff distance measure. The experiment results show that the proposed system is highly efficient and powerful","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125476723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Integrated CBIR Scheme Based on Distance Transform","authors":"Dong Lin, Shih-Hsuan Yang","doi":"10.1109/MMSP.2005.248684","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248684","url":null,"abstract":"In this paper, we developed a new set of content-based descriptors that incorporated three primitive features of color images: chromatic, spatial, and spectral. Based on the distance transform, the proposed local relative distances pattern (LRDP) records the distance pattern of a color plane. Thus, LRDP can be regarded as an integration of chromatic and spatial information. The spectral information for the boundary and texture portion of an image is extracted by the devised frequency histogram. These two descriptors are integrated to form a robust CBIR system. Experimental results show that the proposed scheme has better retrieval accuracy than conventional CBIR methods","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132745080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video Summarization based on Film Grammar","authors":"A. Yoshitaka, Yoshiki Deguchi","doi":"10.1109/MMSP.2005.248620","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248620","url":null,"abstract":"Searching time-intrinsic contents, such as movies, drama, is often a time consuming task, since it is not satisfactory to see some of screen shots to catch the whole story. Summarizing such contents is one of the solutions to diminish the cost in browsing and grasp the contents rapidly. Most of the video summarization methods ever proposed extract `conspicuous' shots from audio/visual point of view, and combine them together to create a summary. These methods imply an issue of disregarding contextual dependency between shots or scenes. We propose a method of summary generation based on `film grammar', which keeps the dependency between shots or scenes. Experimental results show this method provides a viewer better summaries to comprehend the context","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132763514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis on Quantization Error Propagation for Motion-Compensated Lifted Wavelet Video Coding","authors":"Chuo-Ling Chang, Aditya Mavlankar, B. Girod","doi":"10.1109/MMSP.2005.248607","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248607","url":null,"abstract":"In wavelet video coding, the quantization error in the temporal sub-bands propagates through temporal wavelet synthesis to the reconstructed frames. We analyze this propagation by considering the impact of the spatial interpolation in motion compensation on the statistics of the quantization error. Based on the analysis, a distortion model that can be derived from the motion fields is introduced and incorporated into a rate allocation method aiming at reducing the quality fluctuation over frames. Compared to the conventional allocation method that ignores the effect of motion compensation, significant reduction in quality fluctuation is observed while incurring minimal loss in the average quality","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131536174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Centralized P2P Streaming with MDC","authors":"I. Lee, Yifeng He, L. Guan","doi":"10.1109/MMSP.2005.248598","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248598","url":null,"abstract":"Peer-to-peer networking technique represents a vast potential to overcome many constraints in the conventional content distribution networks. In this paper, we propose centralized peer-to-peer (P2P) video streaming with multiple-description coding. Centralized P2P streaming with single and multiple forwarding peers are studied in this paper. We compare between the network loads at the bottleneck link of the client/server framework and that of the proposed framework. The reconstructed video qualities are evaluated in several experiments. We also analyze the frame dropout rates for the proposed system","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132920588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tempo-based MTV-Style Home Video Authoring","authors":"Shih-Hung Lee, S. Wang, C.-C. Jay Kuo","doi":"10.1109/MMSP.2005.248593","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248593","url":null,"abstract":"Automatic authoring of MTV-style home video using visual and music tempo analysis is studied in this work. In the proposed system, the input home video is first segmented into shots by low level features such as the color histogram difference. Then, three types of tempo analysis are conducted; namely, music, global visual (the frame level) and local visual (the facial expression level) tempo analysis. For local visual tempo analysis, we propose a scheme to compute the facial tension from a sequence of facial appearance variation. Finally, the authoring methodology is presented, which consists of music and visual tempo matching to produce MTV-style video. Experiments are conducted using baby home video with encouraging results obtained","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114277653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}