用于HDTV1080P实时编码中可变块大小运动估计的32并行SAD树硬连线引擎

Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Lingfeng Li, S. Goto, T. Ikenaga
{"title":"用于HDTV1080P实时编码中可变块大小运动估计的32并行SAD树硬连线引擎","authors":"Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Lingfeng Li, S. Goto, T. Ikenaga","doi":"10.1109/SIPS.2007.4387630","DOIUrl":null,"url":null,"abstract":"H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve the compression efficiency. For HDTV-1080p application, the massive computation and huge memory bandwidth by the large video frame size and the wide search range are two critical impediments to the real-time hardwired VB-SME engine design. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes bellow 8 × 8 are eliminated in our design to reduce the hardware cost. Second, the low-pass filter based 4:1 down-sampling algorithm successfully reduces about 75% arithmetic computation in each search position. Third, the coarse to fine search scheme is made use of to reduce 25%-50% search candidates. Fourth, C+ memory organization is adopted to reduce the external IO bandwidth. Fifth, horizontal zigzag scan mode optimizes the search window memories. Finally, in circuit design, 4:2 compressor based CSA tree, multi-cycle path delay and 2 pipeline stage SAD tree techniques are utilized to improve the speed and reduce the hardware of each SAD tree. The hardwired integer motion estimation (IME) engine with 192 × 128 search range for HDTVl080p@30Hz is demonstrated in this paper. With TSMC 0.18¿m 1P6M CMOS technology, it is implemented with 485.7k gates standard cells and 327.68k bit on chip memories. The power dissipation is 729mw at 200MHz clock speed.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"28 1","pages":"675-680"},"PeriodicalIF":0.0000,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"32-Parallel SAD Tree Hardwired Engine for Variable Block Size Motion Estimation in HDTV1080P Real-Time Encoding Application\",\"authors\":\"Zhenyu Liu, Yang Song, Ming Shao, Shen Li, Lingfeng Li, S. Goto, T. Ikenaga\",\"doi\":\"10.1109/SIPS.2007.4387630\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve the compression efficiency. For HDTV-1080p application, the massive computation and huge memory bandwidth by the large video frame size and the wide search range are two critical impediments to the real-time hardwired VB-SME engine design. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes bellow 8 × 8 are eliminated in our design to reduce the hardware cost. Second, the low-pass filter based 4:1 down-sampling algorithm successfully reduces about 75% arithmetic computation in each search position. Third, the coarse to fine search scheme is made use of to reduce 25%-50% search candidates. Fourth, C+ memory organization is adopted to reduce the external IO bandwidth. Fifth, horizontal zigzag scan mode optimizes the search window memories. Finally, in circuit design, 4:2 compressor based CSA tree, multi-cycle path delay and 2 pipeline stage SAD tree techniques are utilized to improve the speed and reduce the hardware of each SAD tree. The hardwired integer motion estimation (IME) engine with 192 × 128 search range for HDTVl080p@30Hz is demonstrated in this paper. With TSMC 0.18¿m 1P6M CMOS technology, it is implemented with 485.7k gates standard cells and 327.68k bit on chip memories. The power dissipation is 729mw at 200MHz clock speed.\",\"PeriodicalId\":93225,\"journal\":{\"name\":\"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)\",\"volume\":\"28 1\",\"pages\":\"675-680\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-11-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SIPS.2007.4387630\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SIPS.2007.4387630","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

摘要

H.264/AVC编码标准引入了可变块大小(VBS)运动估计(ME),提高了压缩效率。在HDTV-1080p应用中,大视频帧大小和大搜索范围所带来的庞大计算量和巨大的内存带宽是实时硬连线VB-SME引擎设计的两个关键障碍。在本文中,我们提出了六种技术来克服这些困难。首先,我们的设计消除了8 × 8以下的交互模式,以降低硬件成本。其次,基于低通滤波器的4:1降采样算法成功地减少了每个搜索位置约75%的算术计算。第三,采用粗到细的搜索方案,将候选搜索对象减少25% ~ 50%。第四,采用c++内存组织,减少外部IO带宽。第五,水平之字形扫描模式优化了搜索窗口存储器。最后,在电路设计中,采用了基于4:2压缩机的CSA树、多周期路径延迟和2管道级SAD树技术,提高了速度,减少了每个SAD树的硬件。本文演示了一种192 × 128搜索范围的硬连线整数运动估计引擎(IME)。采用台积电0.18¿m 1P6M CMOS技术,采用485.7k栅极标准单元和327.68k位片上存储器。在200MHz时钟速度下,功耗为729mw。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
32-Parallel SAD Tree Hardwired Engine for Variable Block Size Motion Estimation in HDTV1080P Real-Time Encoding Application
H.264/AVC coding standard incorporates variable block size (VBS) motion estimation (ME) to improve the compression efficiency. For HDTV-1080p application, the massive computation and huge memory bandwidth by the large video frame size and the wide search range are two critical impediments to the real-time hardwired VB-SME engine design. In this paper, we present six techniques to circumvent these difficulties. First, the inter modes bellow 8 × 8 are eliminated in our design to reduce the hardware cost. Second, the low-pass filter based 4:1 down-sampling algorithm successfully reduces about 75% arithmetic computation in each search position. Third, the coarse to fine search scheme is made use of to reduce 25%-50% search candidates. Fourth, C+ memory organization is adopted to reduce the external IO bandwidth. Fifth, horizontal zigzag scan mode optimizes the search window memories. Finally, in circuit design, 4:2 compressor based CSA tree, multi-cycle path delay and 2 pipeline stage SAD tree techniques are utilized to improve the speed and reduce the hardware of each SAD tree. The hardwired integer motion estimation (IME) engine with 192 × 128 search range for HDTVl080p@30Hz is demonstrated in this paper. With TSMC 0.18¿m 1P6M CMOS technology, it is implemented with 485.7k gates standard cells and 327.68k bit on chip memories. The power dissipation is 729mw at 200MHz clock speed.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信