Transform coding of arbitrarily-shaped image segments
Shih-Fu Chang, D. Messerschmitt
MULTIMEDIA '93, published 1993-09-01
DOI: 10.1145/166266.166275 (https://doi.org/10.1145/166266.166275)
Citations: 37
Abstract
Envisioned advanced multimedia video services include both rectangular and arbitrarily-shaped (AS) image segments. Typical examples are image segments of a TV weather reporter produced by the chroma-key technique, and image segments produced by video segmentation or image editing. In this paper, we investigate efficient transform coding techniques for arbitrarily-shaped image segments. We formulate the optimal representation problem in two different domains: the full rectangular domain and the shape-projected domain. In the former, we still use the traditional rectangular transform coding method (e.g. DCT), but we try to find optimal pixel values outside the segment boundary so that the transform spectrum is as compact as possible. A simple but efficient mirror-image extension technique is proposed. In the shape-projected domain, we project the image segment and all basis functions onto the subspace defined over the image region only. Existing coding algorithms, such as the orthogonal transform by Gilge [1] and the iterative coding by Kaup and Aach [2], can be interpreted intuitively within this formulation. To demonstrate the flexibility of the proposed formulation, we also derive a new KLT-like algorithm in the shape-projected domain. We analyze the tradeoffs among compression performance, computational complexity, and codec complexity for the different coding schemes. Simulation results show that complicated algorithms (e.g. iterative, adaptive) can improve the quality by about 5-10 dB at some computational or hardware cost. On the other hand, the proposed simple mirror-image extension technique improves the quality by about 3-4 dB without any overhead. The contributions of this paper lie in efficient problem formulation, new transform coding techniques, and numerical tradeoff analyses. Currently, we are implementing a software program for AS image object editing and manipulation.
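The full-rectangular-domain idea can be illustrated with a minimal sketch. The abstract does not spell out the mirror-image extension procedure, so the code below is only one plausible reading and not necessarily the authors' method: it assumes a row-wise reflection about the nearest segment pixel, and the names `dct_matrix`, `mirror_extend_rows`, and `top_k_energy_fraction` are hypothetical helpers introduced for illustration. Pixels outside the segment are discarded by the decoder, so the encoder is free to choose them; the sketch compares how much 2-D DCT energy falls into the largest coefficients when those pixels are zero-filled versus mirror-filled.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def mirror_extend_rows(block, mask):
    """Fill pixels outside `mask` by reflecting each row about the nearest
    segment pixel (assumed row-wise variant, for illustration only)."""
    out = np.where(mask, block, 0.0).astype(float)
    n_cols = block.shape[1]
    for r in range(block.shape[0]):
        inside = np.flatnonzero(mask[r])
        if inside.size == 0:
            continue  # handled after the loop
        for c in np.flatnonzero(~mask[r]):
            j = inside[np.argmin(np.abs(inside - c))]  # nearest segment pixel
            src = 2 * j - c - np.sign(j - c)           # reflected position
            if not (0 <= src < n_cols and mask[r, src]):
                src = j                                # fall back to edge pixel
            out[r, c] = block[r, src]
    # Rows with no segment pixels copy the nearest row that has some.
    has_pixels = mask.any(axis=1)
    filled = np.flatnonzero(has_pixels)
    for r in np.flatnonzero(~has_pixels):
        if filled.size:
            out[r] = out[filled[np.argmin(np.abs(filled - r))]]
    return out

def top_k_energy_fraction(block, k):
    """Fraction of 2-D DCT energy held by the k largest coefficients."""
    coeffs = dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T
    energy = np.sort(coeffs.ravel() ** 2)[::-1]
    return energy[:k].sum() / energy.sum()

# Hypothetical test data: a smooth ramp seen through a triangular segment.
y, x = np.mgrid[0:8, 0:8]
block = 100.0 + 5.0 * x + 3.0 * y
mask = np.tril(np.ones((8, 8), dtype=bool))

zero_filled = np.where(mask, block, 0.0)
mirror_filled = mirror_extend_rows(block, mask)

print("zero fill   :", top_k_energy_fraction(zero_filled, 4))
print("mirror fill :", top_k_energy_fraction(mirror_filled, 4))
```

The segment pixels themselves are left untouched; only the free pixels change. The extension needs no side information beyond the shape mask, which the decoder must have anyway, which is consistent with the abstract's claim that the technique adds no overhead.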