Hybrid scratchpad and cache memory management for energy-efficient parallel HEVC encoding
Authors: Changlai Song, Lei Ju, Zhiping Jia
Venue: 2015 33rd IEEE International Conference on Computer Design (ICCD)
Publication date: 2015-10-18
DOI: 10.1109/ICCD.2015.7357185 (https://doi.org/10.1109/ICCD.2015.7357185)
Citations: 13
Abstract
The next-generation video coding standard High Efficiency Video Coding (HEVC) compresses high-resolution video better than H.264, at the cost of significantly higher demands on computation power and memory bandwidth. Memory subsystem optimization is therefore of paramount importance for supporting HEVC on resource- and energy-constrained embedded consumer electronics. In this paper, we present a hybrid on-chip memory architecture with both caches and scratchpad memories (SPMs) for parallel HEVC encoding. A run-time prediction algorithm is proposed to identify the most frequently accessed memory regions in the search window(s) used to process individual coding tree units (CTUs). Depending on their intra- and inter-core reuse, these regions are loaded into the private or shared SPMs, which guarantees that accesses to them are served on chip. A relatively small hardware-controlled cache serves the remaining data accesses. Moreover, an adaptive power gating scheme is proposed that powers off SPM sectors whose load windows have expired, further reducing on-chip leakage power. Compared with the state-of-the-art solution, experimental results show that the proposed memory management framework supports high-speed parallel HEVC processing with a substantially smaller on-chip memory, achieving up to 76.23% on-chip leakage energy savings and 33.31% energy savings for the overall memory subsystem.
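The abstract does not give the prediction algorithm, the placement rule, or the gating condition in detail, so the following Python sketch is only an illustration of the general idea under stated assumptions. It assumes that a predictor has already produced an access count per search-window region, that a region reused by more than one core goes to the shared SPM and a region used by a single core goes to that core's private SPM, and that a region's load window ends at the last CTU expected to read it. The names `Region`, `place_regions`, `sectors_to_gate`, the `hot_threshold` parameter, and the one-region-per-sector mapping are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    PRIVATE_SPM = "private_spm"
    SHARED_SPM = "shared_spm"
    CACHE = "cache"


@dataclass
class Region:
    """Hypothetical descriptor of one search-window region."""
    name: str
    predicted_accesses: int  # output of a run-time predictor (not modeled here)
    cores: frozenset         # IDs of the cores predicted to access the region
    size: int                # bytes
    last_use_ctu: int        # assumed end of the load window (CTU index)


def place_regions(regions, private_capacity, shared_capacity, hot_threshold):
    """Greedy placement: hottest regions first, SPM if they fit, else cache.

    private_capacity maps a core ID to the bytes free in its private SPM.
    This is an assumed policy, not the allocation rule of the paper.
    """
    private_free = dict(private_capacity)
    shared_free = shared_capacity
    placement = {}
    ordered = sorted(regions, key=lambda r: r.predicted_accesses, reverse=True)
    for r in ordered:
        target = Placement.CACHE  # default: leave the region to the cache
        if r.predicted_accesses >= hot_threshold:
            if len(r.cores) > 1:
                # Inter-core reuse: try the shared SPM.
                if r.size <= shared_free:
                    shared_free -= r.size
                    target = Placement.SHARED_SPM
            elif len(r.cores) == 1:
                # Intra-core reuse only: try that core's private SPM.
                core = next(iter(r.cores))
                if r.size <= private_free.get(core, 0):
                    private_free[core] -= r.size
                    target = Placement.PRIVATE_SPM
        placement[r.name] = target
    return placement


def sectors_to_gate(regions, placement, current_ctu):
    """Names of SPM-resident regions whose assumed load window has expired."""
    return sorted(
        r.name
        for r in regions
        if placement[r.name] is not Placement.CACHE
        and r.last_use_ctu < current_ctu
    )
```

In this sketch a caller would run `place_regions` once per batch of predicted regions and then call `sectors_to_gate` as encoding advances from CTU to CTU, powering off whatever it returns. A real design would also have to account for the cost of waking a gated sector, which the sketch ignores.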