{"title":"Enabling large decoded instruction loop caching for energy-aware embedded processors","authors":"Ji Gu, Hui Guo","doi":"10.1145/1878921.1878957","DOIUrl":null,"url":null,"abstract":"Low energy consumption in embedded processors is increasingly important in step with the system complexity. The on-chip instruction cache (I-cache) is usually a most energy consuming component on the processor chip due to its large size and frequent access operations. To reduce such energy consumption, the existing loop cache approaches use a tiny decoded cache to filter the I-cache access and instruction decode activity for repeated loop iterations. However, such designs are effective to small and simple loops, and only suitable for DSP kernel-like applications. They are not effectual to many embedded applications where complex loops are common.\n In this paper, we propose a decoded loop instruction cache (DLIC) that is small, hence energy efficient, yet can capture most loops, including large, nested ones with branch executions, so that a significant amount of I-cache accesses and instruction decoding can be eradicated. Experiments on a set of embedded benchmarks show that our proposed DLIC scheme can reduce energy consumption by up to 87%. On average, 66% energy can be saved on instruction fetching and decoding, at a performance overhead of only 1.4%.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1878921.1878957","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Low energy consumption in embedded processors is increasingly important in step with the system complexity. The on-chip instruction cache (I-cache) is usually a most energy consuming component on the processor chip due to its large size and frequent access operations. To reduce such energy consumption, the existing loop cache approaches use a tiny decoded cache to filter the I-cache access and instruction decode activity for repeated loop iterations. However, such designs are effective to small and simple loops, and only suitable for DSP kernel-like applications. They are not effectual to many embedded applications where complex loops are common.
In this paper, we propose a decoded loop instruction cache (DLIC) that is small, hence energy efficient, yet can capture most loops, including large, nested ones with branch executions, so that a significant amount of I-cache accesses and instruction decoding can be eradicated. Experiments on a set of embedded benchmarks show that our proposed DLIC scheme can reduce energy consumption by up to 87%. On average, 66% energy can be saved on instruction fetching and decoding, at a performance overhead of only 1.4%.