An empirical study of datapath, memory hierarchy, and network in SIMD array architectures
M. Herbordt, C. Weems
Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors, 1995-10-02
https://doi.org/10.1109/ICCD.1995.528921
Although SIMD arrays have been built for 30 years, they have as a class been the subject of few empirical design studies. Using ENPASSANT, a simulation environment developed for that purpose, we analyze several aspects of SIMD array architecture with respect to a test suite of spatially mapped applications. Several surprising results are obtained. With respect to memory hierarchy, we find that adding a level of cache to current PE designs is likely to be advantageous, but that such a cache will look quite different than expected. In particular, we find that associativity has unusual significance and that performance varies inversely with block size. Router network results indicate the importance of support for local transfers, broadcast, and reduction even at the expense of arbitrary permutations. Other communication results point to the appropriate dimensionality of k-ary n-cube networks (2 or 3), and the criticality of supporting bidirectional transfers, even if the overall bandwidth remains unchanged.