Qi Guo, A. L. Sartor, Anthony Brandon, A. C. S. Beck, Xuehai Zhou, Stephan Wong
{"title":"可重构VLIW处理器的运行时相位预测","authors":"Qi Guo, A. L. Sartor, Anthony Brandon, A. C. S. Beck, Xuehai Zhou, Stephan Wong","doi":"10.3850/9783981537079_0644","DOIUrl":null,"url":null,"abstract":"It is well-known that different applications exhibit varying amounts of ILP. Execution of these applications on the same fixed-width VLIW processor will result (1) in wasted energy due to underutilized resources if the issue-width of the processor is larger than the inherent ILP; or alternatively, (2) in lower performance if the issue-width is smaller than the inherent ILP. Moreover, even within a single application distinct phases can be observed with varying ILP and therefore changing resource requirements. With this in mind, we designed the ρ-VEX processor, which is a VLIW processor that can change its issue-width at run-time. In this paper, we propose a novel scheme to dynamically (i.e., at run-time) optimize the resource utilization by predicting and matching the number of active data-paths for each application phase. The purpose is to achieve low energy consumption for applications with low ILP, and high performance for applications with high ILP, on a single VLIW processor design. We prototyped the ρ-VEX processor on an FPGA and obtained the dynamic traces of applications running on top of a Linux port. Our results show that it is possible in some cases to achieve the performance of an 8-issue core with 10% lower energy consumption, while in others we achieve the energy consumption of a 2-issue core with close to 20% lower execution time.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Run-time phase prediction for a reconfigurable VLIW processor\",\"authors\":\"Qi Guo, A. L. Sartor, Anthony Brandon, A. C. S. Beck, Xuehai Zhou, Stephan Wong\",\"doi\":\"10.3850/9783981537079_0644\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is well-known that different applications exhibit varying amounts of ILP. Execution of these applications on the same fixed-width VLIW processor will result (1) in wasted energy due to underutilized resources if the issue-width of the processor is larger than the inherent ILP; or alternatively, (2) in lower performance if the issue-width is smaller than the inherent ILP. Moreover, even within a single application distinct phases can be observed with varying ILP and therefore changing resource requirements. With this in mind, we designed the ρ-VEX processor, which is a VLIW processor that can change its issue-width at run-time. In this paper, we propose a novel scheme to dynamically (i.e., at run-time) optimize the resource utilization by predicting and matching the number of active data-paths for each application phase. The purpose is to achieve low energy consumption for applications with low ILP, and high performance for applications with high ILP, on a single VLIW processor design. We prototyped the ρ-VEX processor on an FPGA and obtained the dynamic traces of applications running on top of a Linux port. Our results show that it is possible in some cases to achieve the performance of an 8-issue core with 10% lower energy consumption, while in others we achieve the energy consumption of a 2-issue core with close to 20% lower execution time.\",\"PeriodicalId\":311352,\"journal\":{\"name\":\"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-03-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3850/9783981537079_0644\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3850/9783981537079_0644","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Run-time phase prediction for a reconfigurable VLIW processor
It is well-known that different applications exhibit varying amounts of ILP. Execution of these applications on the same fixed-width VLIW processor will result (1) in wasted energy due to underutilized resources if the issue-width of the processor is larger than the inherent ILP; or alternatively, (2) in lower performance if the issue-width is smaller than the inherent ILP. Moreover, even within a single application distinct phases can be observed with varying ILP and therefore changing resource requirements. With this in mind, we designed the ρ-VEX processor, which is a VLIW processor that can change its issue-width at run-time. In this paper, we propose a novel scheme to dynamically (i.e., at run-time) optimize the resource utilization by predicting and matching the number of active data-paths for each application phase. The purpose is to achieve low energy consumption for applications with low ILP, and high performance for applications with high ILP, on a single VLIW processor design. We prototyped the ρ-VEX processor on an FPGA and obtained the dynamic traces of applications running on top of a Linux port. Our results show that it is possible in some cases to achieve the performance of an 8-issue core with 10% lower energy consumption, while in others we achieve the energy consumption of a 2-issue core with close to 20% lower execution time.