Michael A. Baker, Amrit Panda, Nikhil Ghadge, Aniruddha Kadne, Karam S. Chatha
{"title":"A performance model and code overlay generator for scratchpad enhanced embedded processors","authors":"Michael A. Baker, Amrit Panda, Nikhil Ghadge, Aniruddha Kadne, Karam S. Chatha","doi":"10.1145/1878961.1879011","DOIUrl":null,"url":null,"abstract":"Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and algorithm, “Code Overlay Generator” (COG), for producing high performance dynamic SPM code mappings. Our heuristic does not require profiling information, and is suitable for generating mapping solutions for large programs which are otherwise infeasible using previously proposed Integer Linear Programming (ILP) techniques. We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34% compared to the previous algorithm, and 87% with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1878961.1879011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21
Abstract
Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and algorithm, “Code Overlay Generator” (COG), for producing high performance dynamic SPM code mappings. Our heuristic does not require profiling information, and is suitable for generating mapping solutions for large programs which are otherwise infeasible using previously proposed Integer Linear Programming (ILP) techniques. We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34% compared to the previous algorithm, and 87% with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.