{"title":"矢量存储系统的有效实现技术","authors":"T. Chiueh, Manish Verma, Sanjay A. Padubidri","doi":"10.1109/ISPAN.1994.367139","DOIUrl":null,"url":null,"abstract":"Existing vector machines' memory systems use heavy interleaving and SRAM technology for faster data access. In this paper, we present an efficient vector memory architecture that adopts prime-degree memory interleaving and exploits the capabilities of new-generation DRAM chips with small SRAM cache. The major contribution of this paper is an incremental indexing scheme for prime-degree memory interleaving that takes at most two integer divisions as the initial start-up overhead for each logical vector memory access, and generates one bank/offset address pair per cycle thereafter. We have also developed a vector pre-fetching scheme that ensures that vector data elements are in the SRAM buffers before they are accessed, thus effectively masking the long delays associated with normal DRAM accesses.<<ETX>>","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"434 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Efficient implementation techniques for vector memory systems\",\"authors\":\"T. Chiueh, Manish Verma, Sanjay A. Padubidri\",\"doi\":\"10.1109/ISPAN.1994.367139\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Existing vector machines' memory systems use heavy interleaving and SRAM technology for faster data access. In this paper, we present an efficient vector memory architecture that adopts prime-degree memory interleaving and exploits the capabilities of new-generation DRAM chips with small SRAM cache. The major contribution of this paper is an incremental indexing scheme for prime-degree memory interleaving that takes at most two integer divisions as the initial start-up overhead for each logical vector memory access, and generates one bank/offset address pair per cycle thereafter. 
We have also developed a vector pre-fetching scheme that ensures that vector data elements are in the SRAM buffers before they are accessed, thus effectively masking the long delays associated with normal DRAM accesses.<<ETX>>\",\"PeriodicalId\":142405,\"journal\":{\"name\":\"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)\",\"volume\":\"434 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1994-12-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISPAN.1994.367139\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPAN.1994.367139","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Efficient implementation techniques for vector memory systems
Existing vector machines' memory systems use heavy interleaving and SRAM technology for faster data access. In this paper, we present an efficient vector memory architecture that adopts prime-degree memory interleaving and exploits the capabilities of new-generation DRAM chips with small SRAM cache. The major contribution of this paper is an incremental indexing scheme for prime-degree memory interleaving that takes at most two integer divisions as the initial start-up overhead for each logical vector memory access, and generates one bank/offset address pair per cycle thereafter. We have also developed a vector pre-fetching scheme that ensures that vector data elements are in the SRAM buffers before they are accessed, thus effectively masking the long delays associated with normal DRAM accesses.
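The abstract does not spell out the indexing algorithm, so the following is a minimal sketch of how an incremental bank/offset generator for prime-degree interleaving could work. It assumes bank = address mod P and offset = address div P with P a prime number of banks; the variable names, the value P = 17, and the example base/stride/length are all illustrative choices, not details taken from the paper.

```c
/* Sketch of incremental bank/offset generation for prime-degree
 * interleaving, assuming bank = addr % P and offset = addr / P
 * with P a prime bank count.  Two integer divisions at start-up
 * (one for the base address, one for the stride), then one
 * bank/offset pair per iteration using only adds and a compare.
 * Names and constants are illustrative, not from the paper. */
#include <stdio.h>

#define P 17u  /* prime number of memory banks (example value) */

int main(void) {
    unsigned base = 1000, stride = 24, length = 8;

    /* start-up: two divisions (quotient and remainder of base and stride) */
    unsigned bank   = base % P,   offset   = base / P;
    unsigned d_bank = stride % P, d_offset = stride / P;

    for (unsigned i = 0; i < length; i++) {
        printf("element %u -> bank %u, offset %u\n", i, bank, offset);

        /* incremental update: the next pair needs no further division */
        bank   += d_bank;
        offset += d_offset;
        if (bank >= P) {   /* wrap the bank index and carry into the offset */
            bank -= P;
            offset += 1;
        }
    }
    return 0;
}
```

Under these assumptions, the per-element update uses only additions and one comparison, so an address generator could emit one bank/offset pair per cycle while the two start-up divisions are amortized over the whole vector access; this matches the cost profile the abstract claims, though the paper's actual scheme may differ in detail.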