{"title":"Parallel Memory Implementation for Arbitrary Stride Accesses","authors":"E. Aho, Jarno Vanne, T. Hämäläinen","doi":"10.1109/ICSAMOS.2006.300801","DOIUrl":null,"url":null,"abstract":"Parallel memory modules can be used to increase memory bandwidth and feed a processor with only necessary data. Arbitrary stride access capability with interleaved memories is described in previous research where the skewing scheme is changed at run time according to the currently used stride. This paper presents the improved schemes which are adapted to parallel memories. The proposed novel parallel memory implementation allows conflict free accesses with all the constant strides which has not been possible in prior application specific parallel memories. Moreover, the possible access locations are unrestricted and the data patterns have equal amount of accessed data elements as the number of memory modules. Timing and area estimates are given for Altera Stratix FPGA and 0.18 micrometer CMOS process with memory module count from 2 to 32. The FPGA results show 129 MHz clock frequency for a system with 16 memory modules when read and write latencies are 3 and 2 clock cycles, respectively","PeriodicalId":204190,"journal":{"name":"2006 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","volume":"127 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAMOS.2006.300801","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
Parallel memory modules can be used to increase memory bandwidth and feed a processor with only necessary data. Arbitrary stride access capability with interleaved memories is described in previous research where the skewing scheme is changed at run time according to the currently used stride. This paper presents the improved schemes which are adapted to parallel memories. The proposed novel parallel memory implementation allows conflict free accesses with all the constant strides which has not been possible in prior application specific parallel memories. Moreover, the possible access locations are unrestricted and the data patterns have equal amount of accessed data elements as the number of memory modules. Timing and area estimates are given for Altera Stratix FPGA and 0.18 micrometer CMOS process with memory module count from 2 to 32. The FPGA results show 129 MHz clock frequency for a system with 16 memory modules when read and write latencies are 3 and 2 clock cycles, respectively