Case Study of Efficient Parallel Memory Access Programming for the Embedded Heterogeneous Multicore DSP Architecture ePUMA
Authors: E. Hansson, Joar Sohl, C. Kessler, Dake Liu
Published in: 2011 International Conference on Complex, Intelligent, and Software Intensive Systems, 2011-06-30
DOI: 10.1109/CISIS.2011.103 (https://doi.org/10.1109/CISIS.2011.103)
Citation count: 3
Abstract
We consider the challenges in writing efficient code for ePUMA, a novel domain-specific heterogeneous multicore architecture with SIMD DSP slave cores, multi-banked on-chip vector register files for parallel access, and configurable permutation hardware that decouples memory access from computation. A suitable data layout in memory and in vector registers, combined with ePUMA's powerful addressing modes, is key to exploiting the SIMD units efficiently and achieving the throughput required for prospective applications in 4G mobile telecommunication and multimedia.
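
To illustrate why data layout matters for parallel access to a multi-banked memory, the following is a minimal sketch of a generic bank-conflict model. It is not ePUMA's actual memory system: the bank count, the modulo interleaving, the one-word-per-bank-per-cycle rule, and the padding trick are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter

NUM_BANKS = 8  # hypothetical bank count, not taken from the paper


def bank_of(address, num_banks=NUM_BANKS):
    """Bank selected by simple modulo interleaving of word addresses (assumed scheme)."""
    return address % num_banks


def access_cycles(addresses, num_banks=NUM_BANKS):
    """Cycles to serve one vector access if each bank delivers one word per cycle."""
    per_bank = Counter(bank_of(a, num_banks) for a in addresses)
    return max(per_bank.values())


def column_addresses(col, rows, row_pitch):
    """Word addresses of one matrix column when consecutive rows are row_pitch words apart."""
    return [r * row_pitch + col for r in range(rows)]


if __name__ == "__main__":
    rows = 8
    dense = column_addresses(col=3, rows=rows, row_pitch=8)   # plain row-major layout
    padded = column_addresses(col=3, rows=rows, row_pitch=9)  # one padding word per row
    print("dense layout :", access_cycles(dense), "cycles")
    print("padded layout:", access_cycles(padded), "cycles")
```

In this model, reading a column of an 8x8 row-major matrix hits the same bank eight times, while padding each row by one word spreads the same column across all eight banks. Permutation hardware of the kind the abstract mentions could plausibly achieve a similar effect without spending memory on padding, but the details of ePUMA's mechanism are not given here.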