D. Besiris, N. Laskaris, F. Fotopoulou, G. Economou
{"title":"Key frame extraction in video sequences: a vantage points approach","authors":"D. Besiris, N. Laskaris, F. Fotopoulou, G. Economou","doi":"10.1109/MMSP.2007.4412909","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412909","url":null,"abstract":"In this work, the idea of key frame extraction from single shots in video sequences is presented. The method is implemented by an efficient two-step algorithm, which belongs to neither the clustering-based nor the temporal-variation-based techniques. In the first step, an MST (minimal spanning tree) graph is constructed, where each node is associated with a single frame of the shot. In the second step, key frames are extracted based on the principle of their maximum spread. The number of selected key frames is controlled by an adaptively defined threshold, while the validity of the results is evaluated by the fidelity measure.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115498068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Authentication and Tampering Localization using Distributed Source Coding","authors":"Y. Lin, D. Varodayan, B. Girod","doi":"10.1109/MMSP.2007.4412899","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412899","url":null,"abstract":"Media authentication is important in content delivery via untrusted intermediaries, such as peer-to-peer (P2P) file sharing. Many differently encoded versions of a media file might exist. Our previous work applied distributed source coding to distinguish the legitimate diversity of encoded images from tampering. An authentication decoder was supplied with a Slepian-Wolf encoded lossy version of the image as authentication data. Distributed source coding provided the desired robustness against legitimate encoding variations, while detecting illegitimate modification. We augment the decoder to localize tampering in an image already deemed to be unauthentic. The localization decoder requires only incremental localization data beyond the authentication data since we use rate-adaptive distributed source codes. Both decoders perform joint bitplane decoding, rather than conditional bitplane decoding. Our results demonstrate that tampered image blocks can be identified with high probability using authentication plus localization data of only a few hundred bytes for a 512×512 image.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122711921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimedia Technologies and Solutions for Educational Applications: Opportunities, Trends and Challenges","authors":"Patti Price","doi":"10.1109/MMSP.2007.4412804","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412804","url":null,"abstract":"This report aims to provide an overview of multimedia technologies in education, particularly language technologies, and particularly applications aimed at children. Emphasis is on those aspects that may not necessarily be familiar to engineers, such as linguistics and language pedagogy.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125367848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Intra Prediction for H.264/AVC Scalable Extension","authors":"Yang Liu, G. Rath, C. Guillemot","doi":"10.1109/MMSP.2007.4412864","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412864","url":null,"abstract":"This paper presents an improved intra prediction scheme for H.264/AVC scalable extension. First, a more flexible intra prediction is realized by introducing a sub-macroblock (sub-MB) level inter-layer intra prediction. Second, the upsampled reconstructed base layer helps to reduce the number of candidate modes for the spatial intra prediction. Finally, the neighboring pixels of the reconstructed enhancement layer are used to improve the quality of the reference from the base layer by a simple cross boundary filter. Our scheme improves the PSNR up to 0.53 dB when compared with the JSVM standard implementation for all intra coding.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128283419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Reconstruction in Wyner-Ziv Video Coding with Multiple Side Information","authors":"D. Kubasov, J. Nayak, C. Guillemot","doi":"10.1109/MMSP.2007.4412848","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412848","url":null,"abstract":"This paper addresses the problem of optimal minimum mean-squared error reconstruction of quantised samples in Wyner-Ziv video coding systems. Closed-form expressions of the optimal reconstructed values are derived for a Laplacian correlation model. The method is used for both single and multiple side information scenarios (the latter is also referred to as multi-hypothesis Wyner-Ziv decoding). The efficiency of the proposed optimal reconstruction method is confirmed by rate-distortion performance results, showing significant decrease of the distortion of the decoded sequence, compared to simple reconstruction methods that have been employed so far.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129944970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Argyropoulos, N. Thomos, N. Boulgouris, M. Strintzis
{"title":"Adaptive Frame Interpolation for Wyner-Ziv Video Coding","authors":"S. Argyropoulos, N. Thomos, N. Boulgouris, M. Strintzis","doi":"10.1109/MMSP.2007.4412842","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412842","url":null,"abstract":"This paper addresses the problem of frame interpolation for Wyner-Ziv video coding. A novel frame interpolation method based on block-adaptive matching algorithm for motion estimation is presented. This scheme enables block size adaptation to local activity within frames using block merging and splitting techniques. The efficiency of the proposed method is evaluated in transform domain Wyner-Ziv video coding. The experimental results demonstrate the superiority of the proposed method over existing frame interpolation techniques.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126496910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Source-Channel Coding for the Scalable Extension of H.264/MPEG-4 AVC","authors":"M. Stoufs, A. Munteanu, J. Cornelis, P. Schelkens","doi":"10.1109/MMSP.2007.4412833","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412833","url":null,"abstract":"In this paper, we propose a joint source-channel coding (JSCC) methodology which minimizes the end-to-end distortion for the transmission of H.264/MPEG-4 scalable video over packet loss channels. The proposed JSCC-approach employs low-density parity-check codes in order to provide channel protection and relies on Lagrangian-based optimization techniques to derive the appropriate protection levels for each layer produced by the scalable source codec. Experiments show that our JSCC methodology delivers competitive results to state-of-the-art Lagrangian-based algorithms. However, in contrast to the state-of-the-art, our approach significantly reduces the computational complexity. We conclude that the proposed JSCC methodology provides optimized resilience against transmission errors in scalable video streaming over variable-bandwidth error-prone channels.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131178949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Itoh, Akira Iwabuchi, K. Kojima, M. Ishigame, Kazuyo Tanaka, Shi-wook Lee
{"title":"Music Boundary Detection Using Similarity in a Music Selection","authors":"Y. Itoh, Akira Iwabuchi, K. Kojima, M. Ishigame, Kazuyo Tanaka, Shi-wook Lee","doi":"10.1109/MMSP.2007.4412898","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412898","url":null,"abstract":"This paper proposes a new method of extracting music boundaries, such as a boundary between musical selections, or a boundary between a musical selection and speech, for automatic segmentation of video data and other applications. The method utilizes acoustic similarity in a music selection. Similar partial sections are first extracted by means of a new algorithm called Segmental Continuous Dynamic Programming, or Segmental CDP. The music boundary is identified by reference to multiple similar sections and their location information, as extracted by Segmental CDP. The performance of the proposed method is evaluated for music boundary extraction using actual music data sets. The study demonstrates that the proposed method extracts music boundaries well for both evaluation data and a real broadcast music program.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134325825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Memory-Efficient Video Object Segmentation in Dynamic Background with Multi-Background Registration Technique","authors":"W. Chan, Shao-Yi Chien","doi":"10.1109/MMSP.2007.4412857","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412857","url":null,"abstract":"Background subtraction video segmentation is an important first step for video surveillance applications with a fixed camera. There are many existing methods in the literature. However, most of them are either too simple to handle complex environments, such as dynamic backgrounds, or too complex to be executed in real time. Based on the proposed multi-background registration technique, this paper presents a real-time video object segmentation algorithm. The proposed algorithm can better handle non-static background cases compared with the original single background registration segmentation. Compared with other works that can handle dynamic background cases, it is efficient in memory usage.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"53-54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131004560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Searching For The Optimal Area Of TFV Representation","authors":"D. Zhong","doi":"10.1109/MMSP.2007.4412859","DOIUrl":"https://doi.org/10.1109/MMSP.2007.4412859","url":null,"abstract":"Visual images are often characterized by the distribution of certain key features. Taking the face image as an example, the eye, nose and mouth are often regarded as characterizing features for recognizing face images. We call these aspects the structural and statistical information of visual images and aim to develop a framework for their unified description. We extract certain features from randomly chosen subareas; these features represent the local texture information well. We show our retrieval results over a public face database. We found that certain subareas can provide quite good retrieval results, but an exhaustive search for such subareas is time-consuming. We further developed a simple fast searching method which greatly simplifies the searching process while preserving the good performance.","PeriodicalId":225295,"journal":{"name":"2007 IEEE 9th Workshop on Multimedia Signal Processing","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124300645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}