{"title":"高性能大规模并行SIMD系统的处理器元件设计","authors":"D. Beal, C. Lambrinoudakis","doi":"10.1142/S0129053395000208","DOIUrl":null,"url":null,"abstract":"This paper describes the architecture of the General Purpose with Floating Point support (GPFP) processing element, which uses the expansion of circuitry from VLSI advances to provide on-chip memory and cost-effective extra functionality. A major goal was to accelerate floating point arithmetic. This was combined with architectural aims of cost-effectiveness, achieving the floating-point capability from general-purpose units, and retaining the 1-bit manipulations available in the earlier generation. With a 50 MHz clock each PE is capable of 2.5 MegaFlops. Normalized to the same clock rate, the GPFP PE exceeds first generation PEs by far, namely the DAP by a factor of 50 and the MPP by a factor of 20, and also outperforms the recent MasPar design by a factor of four. A 32×32 GPFP array is capable of up to 2.5 GigaFlops and 6500 MIPS, on 32-bit additions. These speedups are obtained by architectural features rather than increased width of data-handling and are combined with parsimonious use of circuitry compatible with massively parallel fabrication. The GPFP also incorporates Reconfigurable Local Control (RLC), a technique that combines a considerable degree of local autonomy within PEs and microcode flexibility, giving the machine improved general-purpose programmability in addition to floating-point numerical performance.","PeriodicalId":270006,"journal":{"name":"Int. J. High Speed Comput.","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Design of a Processor Element for a High Performance Massively Parallel SIMD System\",\"authors\":\"D. Beal, C. Lambrinoudakis\",\"doi\":\"10.1142/S0129053395000208\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes the architecture of the General Purpose with Floating Point support (GPFP) processing element, which uses the expansion of circuitry from VLSI advances to provide on-chip memory and cost-effective extra functionality. A major goal was to accelerate floating point arithmetic. This was combined with architectural aims of cost-effectiveness, achieving the floating-point capability from general-purpose units, and retaining the 1-bit manipulations available in the earlier generation. With a 50 MHz clock each PE is capable of 2.5 MegaFlops. Normalized to the same clock rate, the GPFP PE exceeds first generation PEs by far, namely the DAP by a factor of 50 and the MPP by a factor of 20, and also outperforms the recent MasPar design by a factor of four. A 32×32 GPFP array is capable of up to 2.5 GigaFlops and 6500 MIPS, on 32-bit additions. These speedups are obtained by architectural features rather than increased width of data-handling and are combined with parsimonious use of circuitry compatible with massively parallel fabrication. The GPFP also incorporates Reconfigurable Local Control (RLC), a technique that combines a considerable degree of local autonomy within PEs and microcode flexibility, giving the machine improved general-purpose programmability in addition to floating-point numerical performance.\",\"PeriodicalId\":270006,\"journal\":{\"name\":\"Int. J. High Speed Comput.\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. J. High Speed Comput.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/S0129053395000208\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. High Speed Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/S0129053395000208","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Design of a Processor Element for a High Performance Massively Parallel SIMD System
This paper describes the architecture of the General Purpose with Floating Point support (GPFP) processing element, which uses the expansion of circuitry from VLSI advances to provide on-chip memory and cost-effective extra functionality. A major goal was to accelerate floating point arithmetic. This was combined with architectural aims of cost-effectiveness, achieving the floating-point capability from general-purpose units, and retaining the 1-bit manipulations available in the earlier generation. With a 50 MHz clock each PE is capable of 2.5 MegaFlops. Normalized to the same clock rate, the GPFP PE exceeds first generation PEs by far, namely the DAP by a factor of 50 and the MPP by a factor of 20, and also outperforms the recent MasPar design by a factor of four. A 32×32 GPFP array is capable of up to 2.5 GigaFlops and 6500 MIPS, on 32-bit additions. These speedups are obtained by architectural features rather than increased width of data-handling and are combined with parsimonious use of circuitry compatible with massively parallel fabrication. The GPFP also incorporates Reconfigurable Local Control (RLC), a technique that combines a considerable degree of local autonomy within PEs and microcode flexibility, giving the machine improved general-purpose programmability in addition to floating-point numerical performance.