Vector computations on an orthogonal memory access multiprocessing system
Authors: I. Scherson, Yiming Ma
Published in: 1987 IEEE 8th Symposium on Computer Arithmetic (ARITH), 1987-05-18
DOI: 10.1109/ARITH.1987.6158715 (https://doi.org/10.1109/ARITH.1987.6158715)
An Orthogonal Memory Access system allows multiple processors to concurrently access distinct rows or columns of a rectangular array of data elements. The resulting tightly coupled multiprocessing system is feasible with current technology and has even been suggested for VLSI as a “reduced mesh”. In this paper we introduce the architecture and concentrate on its application to a number of basic vector and numerical computations. Matrix multiplication, L-U decomposition, polynomial evaluation, and solutions to linear systems and partial differential equations all show a speed-up of O(n) for an n-processor system. The flexibility in the choice of the number of processing elements (PEs) makes the architecture a strong competitor in the world of special-purpose parallel systems. In fact, we prove that the machine performs within a factor of 3 of any other system with the same number of processors.
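The access pattern described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the class `OrthogonalMemory`, the function `parallel_matvec`, and the step-counting model are all hypothetical, and the example uses a matrix-vector product rather than any of the paper's own algorithms. It assumes an n x n memory shared by n PEs, where PE i reads row i in row mode or column i in column mode, and it counts one parallel step whenever all PEs perform one multiply-add at the same time.

```python
class OrthogonalMemory:
    """Toy model (an assumption, not the paper's design): an n x n array
    in which PE i can read row i ("row" mode) or column i ("column" mode)."""

    def __init__(self, data):
        self.n = len(data)
        self.data = [list(row) for row in data]

    def read(self, mode, pe):
        # Distinct PEs touch distinct rows (or columns), so reads never conflict.
        if mode == "row":
            return list(self.data[pe])
        if mode == "column":
            return [self.data[r][pe] for r in range(self.n)]
        raise ValueError("mode must be 'row' or 'column'")


def parallel_matvec(memory, x, mode="row"):
    """Compute A @ x (row mode) or A^T @ x (column mode).

    Returns the result, the number of simulated parallel steps, and the
    number of multiply-adds a single processor would need.
    """
    n = memory.n
    # Each PE fetches its own row or column.
    local = [memory.read(mode, pe) for pe in range(n)]
    y = [0] * n
    parallel_steps = 0
    for k in range(n):
        # In hardware all n PEs would do this multiply-add simultaneously;
        # here the inner loop runs them one after another.
        for pe in range(n):
            y[pe] += local[pe][k] * x[k]
        parallel_steps += 1
    sequential_ops = n * n
    return y, parallel_steps, sequential_ops


if __name__ == "__main__":
    mem = OrthogonalMemory([[1, 2], [3, 4]])
    print(parallel_matvec(mem, [1, 1], mode="row"))
    print(parallel_matvec(mem, [1, 1], mode="column"))
```

Under this simplified model the product takes n parallel steps instead of n * n sequential multiply-adds, which is the kind of O(n) speed-up the abstract reports for an n-processor system. Switching `mode` to `"column"` yields the transposed product without moving any data, which is the point of having both access directions.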