Optimization of Propagate Partial SAD and SAD tree motion estimation hardwired engine for H.264
Authors: Zhenyu Liu, S. Goto, T. Ikenaga
Venue: 2008 IEEE International Conference on Computer Design
Published: 2008-10-01
DOI: https://doi.org/10.1109/ICCD.2008.4751881
Citations: 13
Abstract
Variable block size motion estimation is an efficient way to reduce temporal redundancy, and it has been adopted by the latest video coding standard, H.264/AVC. The increase in computational complexity that the variable block size technique brings makes a hardwired accelerator essential, especially for real-time applications. In this paper, the authors apply architecture-level and circuit-level approaches to improve the performance of the Propagate Partial SAD and SAD Tree hardwired engines, which outperform their counterparts once the impact of supporting variable block sizes is taken into account. Experiments demonstrate that, compared with the original architectures, the proposed approaches save 14.7% of the hardware cost for the Propagate Partial SAD architecture and 18.0% for the SAD Tree architecture. With TSMC 0.18 µm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture attains an operating frequency of 231.6 MHz at a cost of 84.1 k gates. Correspondingly, the optimized SAD Tree architecture reaches 204.8 MHz at a hardware cost of 88.5 k gates.
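To make the variable block size requirement concrete, the following is a minimal software sketch, not the authors' hardware design. It computes the sum of absolute differences (SAD) for the sixteen 4x4 sub-blocks of one 16x16 macroblock and then reuses those values to form the SADs of all 41 H.264 partitions. The function names `sad_4x4_grid` and `merge_sads` are hypothetical, and the code makes no attempt to model the Propagate Partial SAD or SAD Tree datapaths themselves.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock.

    cur and ref are 16x16 lists of integer pixel values (current block and
    reference candidate). grid[by][bx] covers rows 4*by..4*by+3 and
    columns 4*bx..4*bx+3.
    """
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def merge_sads(grid):
    """Combine the sixteen 4x4 SADs into SADs for every H.264 partition size.

    Partition names follow the width x height convention. Each list is
    ordered row by row, left to right.
    """
    # (width, height) of each partition, measured in 4x4 sub-blocks.
    shapes = {
        "4x4": (1, 1),
        "8x4": (2, 1),
        "4x8": (1, 2),
        "8x8": (2, 2),
        "16x8": (4, 2),
        "8x16": (2, 4),
        "16x16": (4, 4),
    }
    sads = {}
    for name, (w, h) in shapes.items():
        sads[name] = [
            sum(grid[y][x]
                for y in range(y0, y0 + h)
                for x in range(x0, x0 + w))
            for y0 in range(0, 4, h)
            for x0 in range(0, 4, w)
        ]
    return sads
```

Calling `merge_sads(sad_4x4_grid(cur, ref))` for one search candidate yields 16 + 8 + 8 + 4 + 2 + 2 + 1 = 41 SAD values. Every larger partition is built only from the 4x4 results, which is the reuse that a variable block size engine has to support in hardware.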