{"title":"嵌入式异构多核DSP架构ePUMA的高效并行存储器访问编程实例研究","authors":"E. Hansson, Joar Sohl, C. Kessler, Dake Liu","doi":"10.1109/CISIS.2011.103","DOIUrl":null,"url":null,"abstract":"We consider the challenges in writing efficient code for ePUMA, a novel domain-specific heterogeneous multicore architecture with SIMD DSP slave cores, multi-banked on-chip vector register files for parallel access and configurable permutation hardware that decouples memory access from computation. Suitable data layout in memory and in vector registers, combined with using ePUMA's powerful addressing modes, is key to exploiting SIMD units efficiently and achieving the throughput required for prospective applications in 4G mobile telecommunication and multimedia.","PeriodicalId":203206,"journal":{"name":"2011 International Conference on Complex, Intelligent, and Software Intensive Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Case Study of Efficient Parallel Memory Access Programming for the Embedded Heterogeneous Multicore DSP Architecture ePUMA\",\"authors\":\"E. Hansson, Joar Sohl, C. Kessler, Dake Liu\",\"doi\":\"10.1109/CISIS.2011.103\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider the challenges in writing efficient code for ePUMA, a novel domain-specific heterogeneous multicore architecture with SIMD DSP slave cores, multi-banked on-chip vector register files for parallel access and configurable permutation hardware that decouples memory access from computation. Suitable data layout in memory and in vector registers, combined with using ePUMA's powerful addressing modes, is key to exploiting SIMD units efficiently and achieving the throughput required for prospective applications in 4G mobile telecommunication and multimedia.\",\"PeriodicalId\":203206,\"journal\":{\"name\":\"2011 International Conference on Complex, Intelligent, and Software Intensive Systems\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-06-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Complex, Intelligent, and Software Intensive Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CISIS.2011.103\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Complex, Intelligent, and Software Intensive Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISIS.2011.103","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Case Study of Efficient Parallel Memory Access Programming for the Embedded Heterogeneous Multicore DSP Architecture ePUMA
We consider the challenges in writing efficient code for ePUMA, a novel domain-specific heterogeneous multicore architecture with SIMD DSP slave cores, multi-banked on-chip vector register files for parallel access and configurable permutation hardware that decouples memory access from computation. Suitable data layout in memory and in vector registers, combined with using ePUMA's powerful addressing modes, is key to exploiting SIMD units efficiently and achieving the throughput required for prospective applications in 4G mobile telecommunication and multimedia.