Manuel Mohr, Artjom Grudnitsky, Tobias Modschiedler, L. Bauer, Sebastian Hack, J. Henkel
{"title":"Hardware acceleration for programs in SSA form","authors":"Manuel Mohr, Artjom Grudnitsky, Tobias Modschiedler, L. Bauer, Sebastian Hack, J. Henkel","doi":"10.1109/CASES.2013.6662518","DOIUrl":null,"url":null,"abstract":"Register allocation is one of the most time-consuming parts of the compilation process. Depending on the quality of the register allocation, a large amount of shuffle code to move values between registers is generated. In this paper, we propose a processor architecture extension to provide register file permutations by which the shuffle code can be implemented more efficiently. We present compiler support to utilize this extension, an evaluation regarding performance and compilation time using the SPEC CINT2000 benchmark, as well as an analysis of area and frequency overhead of our architecture implementation. We find that using our extension, the number of executed instructions is reduced by up to 5.1 % while the compilation time is unaffected.","PeriodicalId":354180,"journal":{"name":"2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CASES.2013.6662518","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Register allocation is one of the most time-consuming parts of the compilation process. Depending on the quality of the register allocation, a large amount of shuffle code to move values between registers is generated. In this paper, we propose a processor architecture extension to provide register file permutations by which the shuffle code can be implemented more efficiently. We present compiler support to utilize this extension, an evaluation regarding performance and compilation time using the SPEC CINT2000 benchmark, as well as an analysis of area and frequency overhead of our architecture implementation. We find that using our extension, the number of executed instructions is reduced by up to 5.1 % while the compilation time is unaffected.