{"title":"Communication-friendly encryption of multimedia","authors":"Min Wu, Yinian Mao","doi":"10.1109/MMSP.2002.1203303","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203303","url":null,"abstract":"This paper discusses encryption operations that selectively encrypt content-carrying segments of multimedia data stream. We propose and analyze three techniques that work in different domains, namely, a syntax-aware selective bitstream encryption tool with bit stuffing, a generalized index mapping encryption tool with controlled overhead and an intra-bitplane encryption tool compatible with fine granularity scalable coding. The designs of these proposed encryption operations take into consideration the inherent structure and syntax of multimedia sources and have improved friendliness to communications, compression and computation.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133611919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedded image compression using DCT based subband decomposition and SLCCA data organization","authors":"Junqiang Lan, X. Zhuang","doi":"10.1109/MMSP.2002.1203253","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203253","url":null,"abstract":"Wavelet transform provides harmonic space-frequency localization and great energy compaction, but with generally high computational complexity. In this paper, an 8×8 fast discrete cosine transform (DCT) approach is adopted to perform subband decomposition, followed by SLCCA data organization and entropy coding. Simulation results showed that the embedded DCT-SLCCA image compression reduced the computational complexity to only a quarter of the wavelet based subband decomposition while the average peak signal-to-noise ratio (PSNR) degraded at the same bit rate by only 0.6 dB. The subjective image quality degradation was almost unnoticeable. Moreover, due to 8×8 fast DCT hardware implementation being commercially available, the proposed DCT-SLCCA has the potential for high performance high speed image coding and transmission.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115684775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characterization of abrupt/gradual video shot transitions as unsmoothed/smoothed singularity","authors":"Chun-Shien Lu","doi":"10.1109/MMSP.2002.1203282","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203282","url":null,"abstract":"The multitude of high-level video operations demands sophisticated low-level video techniques. Detection (or segmentation) of video shot transitions (or boundaries) is one of the crucial low-level operations towards automatic video indexing, video editing, video abstracting or preview, and so on. Since content-based multimedia processing has been the focus of MPEG7, we shall develop a scheme to precisely characterize video shot transitions for further applications. Varieties of shot transition detection techniques have been reported in literature. However, no work was really done by taking the benefit of analyzing shot transitions from the local singularities of the signal. In this paper, we propose to characterize shot transitions from their time-frequency variations. First, the color differences between every two neighboring frames are collected as a discontinuity sequence. Second, the time frequency analysis of 1D continuous wavelet transform is employed to locate and interpret the local singularities of the resultant discontinuity sequence. Results reveal that abrupt transition could be characterized as an unsmoothed singularity while gradual transition could be characterized as a smoothed singularity.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116452820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The distributed Karhunen-Loeve transform","authors":"Michael Gastpar, P. Dragotti, M. Vetterli","doi":"10.1109/MMSP.2002.1203247","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203247","url":null,"abstract":"The Karhunen-Loeve transform is a key element of many signal processing tasks, including classification and compression. In this paper, we consider distributed signal processing scenarios with limited communication between correlated sources, and we investigate a distributed Karhunen-Loeve transform (KLT). In particular, a partial (where only a subset of sources are observed) and a conditional KLT (where some sources act as side information) are posed and solved in a rate-distortion sense. The partial KLT leads to an original bit allocation problem, while the conditional KLT leads to a Wyner-Ziv solution which is separable at the sources. These two cases can be seen as extreme cases of a distributed KLT.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"31 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124603369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Medium access control with channel state information for large sensor networks","authors":"S. Adireddy, L. Tong","doi":"10.1109/MMSP.2002.1203334","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203334","url":null,"abstract":"Traditionally, random access protocols have been designed and studied by assuming simple models for the physical layer. We introduce a reception model that incorporates the channel states of the transmitting users and allows for multiple simultaneous successes. We assume that each user has access to his channel state and propose a variant of the Slotted ALOHA protocol for medium access where the transmit probability is chosen as a function of the channel state. We introduce the notion of asymptotic stable throughput and characterize the achievable asymptotic stable throughput through the use of channel state information. As an example, we consider the application of the results to sensor networks.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124634826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A DSP based MPEG-2 video decoder for HDTV or multichannel SDTV","authors":"S. Cacopardi, M. Caponi, F. Frescura, Simone Sabina","doi":"10.1109/MMSP.2002.1203266","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203266","url":null,"abstract":"This paper describes a DSP based MPEG-2 video decoding software. The proposed decoder is able to reconstruct with full quality, in real-time, a sequence in HDTV format (corresponding to a subset of the MP@HL configuration) or up to three SDTV sequences (corresponding to the MP@ML configuration). The developed implementation is based on single DSP, reducing the cost and enabling an easy upgrading of the system with new capabilities. The chosen device is the high performance TMS320C64x DSP by Texas Instruments. The target applications are the transcoding devices, video servers and all professional digital video equipments.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126851039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The latest achievement of VC project for automatic video caption generation","authors":"Takahiro Suzuki, T. Kitazume, M. Sugiyama","doi":"10.1109/MMSP.2002.1203344","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203344","url":null,"abstract":"This paper describes the progress of our automatic video caption generation project. VC project has developed video caption markup language and its player (VCML and VCML player) to reduce labor and cost of making captions. VCML player, which displays video with its caption, has new functions. One is displaying auditory scene symbol, and another is tree-structured VCML files. Voice-pause method, which was originally developed to align voice intervals and their corresponding written text, has been improved for sound data containing both voice and music intervals. The results of the alignment experiment show that the improved method, voice-music-pause method, can align all voice, music and pause intervals effectively.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127420415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Increasing the capacity of LSB-based audio steganography","authors":"N. Cvejic, T. Seppänen","doi":"10.1109/MMSP.2002.1203314","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203314","url":null,"abstract":"Conventionally, a perceptual limit of three bits per sample is imposed to the basic LSB audio steganography method. In this paper, we present a novel modification to standard LSB algorithm that is able to embed four bits per sample, thus improving the capacity of data hiding channel by 33%. The proposed algorithm makes use of minimum error replacement method for LSB adjustment and modified error diffusion method for decreasing SNR value. Objective test showed the algorithm succeeds in this task, while keeping SNR value close to the level of SNR obtained by standard LSB embedding with three bits per sample capacity. Subjective listening test proved that high perceptual transparency is accomplished even if four LSBs of host audio signal are used for data hiding.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123128728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient database search strategy for audio fingerprinting","authors":"J. Haitsma, T. Kalker, J. Oostveen","doi":"10.1109/MMSP.2002.1203276","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203276","url":null,"abstract":"In this paper we present a highly efficient audio fingerprinting system. At the core of the presented system are a highly robust fingerprint extraction method and a very efficient fingerprint search strategy, which enable searching a large fingerprint database with only limited computing resources. We describe the main principles of the method, as well as a model for false acceptance rates.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"11 5/6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114127014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-repudiation oblivious watermarking schema for secure digital video distribution","authors":"W. Zhou, T. Rockwood, P. Sagetong","doi":"10.1109/MMSP.2002.1203316","DOIUrl":"https://doi.org/10.1109/MMSP.2002.1203316","url":null,"abstract":"This paper presents a mechanism and algorithm for creating undeniable watermarks. It assumes a system where a content owner or provider uses outside agents to distribute its content. Content watermarked by distribution agents using this system will be undeniably recognizable by the content provider as originating with that distribution agent. That is to say that given N distribution agents, the content provider will be able to tell which distribution agent watermarked the content. The system does not allow any distribution agent to watermark content that would appear to have been watermarked by another agent and it does also not allow the content provider to watermark content that would appear to have been watermarked by a particular distribution agent. This allows the content provider to place a high degree of trust in the identification of the distribution agent and trace \"leak\" locations of pirated copies of videos.","PeriodicalId":398813,"journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114755931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}