Transform coding of arbitrarily-shaped image segments

Shih-Fu Chang, D. Messerschmitt
MULTIMEDIA '93, published 1993-09-01. DOI: 10.1145/166266.166275 (https://doi.org/10.1145/166266.166275)
Citations: 37

Abstract

Envisioned advanced multimedia video services include both rectangular and arbitrarily-shaped image segments. Typical examples are image segments of a TV weather reporter produced by the chroma-key technique and image segments produced by video segmentation or image editing. In this paper, we investigate efficient transform coding techniques for arbitrarily-shaped image segments. We formulate the optimal representation problem in two different domains: the full rectangular domain and the shape-projected domain. In the former, we still use the traditional rectangular transform coding method (e.g. DCT) but try to find optimal pixel values outside the segment boundary in order to make the transform spectrum as compact as possible. A simple but efficient mirror-image extension technique is proposed. In the shape-projected domain, we project the image segment and all basis functions into the subspace spanned over the image region only. Existing coding algorithms, such as the orthogonal transform by Gilge [1] and iterative coding by Kaup and Aach [2], can be intuitively interpreted in this domain. To demonstrate the flexibility of the proposed formulation, we also derive a new KLT-like algorithm in the shape-projected domain. We analyze the tradeoffs between compression performance, computational complexity, and codec complexity for different coding schemes. Simulation results show that complicated algorithms (e.g. iterative, adaptive) can improve the quality by about 5-10 dB at some computational or hardware cost. On the other hand, the proposed simple mirror-image extension technique improves the quality by about 3-4 dB without any overhead. The contributions of this paper lie in efficient problem formulation, new transform coding techniques, and numerical tradeoff analyses. Currently, we are implementing a software program for arbitrarily-shaped (AS) image object editing and manipulation.
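The abstract does not spell out the mirror-image extension, so the following is only a minimal sketch of one plausible reading: pixels outside the segment are filled by reflecting the segment about its boundary before an ordinary DCT is applied. The function names (`mirror_extend_row`, `dct2`, `dc_energy_fraction`), the restriction to a single row with one contiguous run of segment pixels, and the use of the DC energy share as a compactness measure are all assumptions made for illustration, not the authors' method.

```python
import math


def mirror_extend_row(row, mask):
    """Fill pixels outside the segment by reflecting the segment about its ends.

    Assumption for this sketch: the segment pixels in the row form one
    contiguous run; mask[i] is True where pixel i belongs to the segment.
    """
    inside = [i for i, m in enumerate(mask) if m]
    if not inside:
        return list(row)  # nothing to mirror; leave the row unchanged
    start, n = inside[0], len(inside)
    if inside[-1] - start + 1 != n:
        raise ValueError("sketch handles one contiguous run per row")
    segment = row[start:start + n]
    out = []
    for i in range(len(row)):
        j = (i - start) % (2 * n)   # position within one mirrored period
        if j >= n:
            j = 2 * n - 1 - j       # reflected half of the period
        out.append(segment[j])
    return out


def dct2(x):
    """Orthonormal 1-D DCT-II, written out directly for clarity (O(N^2))."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        coeffs.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(x)))
    return coeffs


def dc_energy_fraction(x):
    """Share of the total coefficient energy carried by the DC term."""
    c = dct2(x)
    total = sum(v * v for v in c)
    return c[0] ** 2 / total if total else 0.0


if __name__ == "__main__":
    row = [0, 0, 0, 10, 12, 14, 0, 0]
    mask = [False, False, False, True, True, True, False, False]
    extended = mirror_extend_row(row, mask)
    print(extended)  # [14, 12, 10, 10, 12, 14, 14, 12]
    print(dc_energy_fraction(row), dc_energy_fraction(extended))
```

On this toy row, the DC term carries about 0.37 of the coefficient energy with zero padding and about 0.98 after mirroring, which illustrates why a smoother fill outside the boundary can make the spectrum more compact. A decoder in such a scheme would discard the reconstructed pixels outside the mask, so the fill values themselves need not be transmitted separately.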