Towards the empirical design of massively parallel arrays for spatially mapped applications
M. Herbordt, C. Weems
Proceedings of Conference on Computer Architectures for Machine Perception, 1995-09-18
DOI: 10.1109/CAMP.1995.521020 (https://doi.org/10.1109/CAMP.1995.521020)
Citations: 5
Abstract
Although SIMD arrays have been built since the 1960s, they have undergone few empirical studies. The underlying problems, which have included the lack of a unified architectural framework and the computational intractability of simulating large PE arrays, are addressed through the use of trace compilation, a novel approach to trace-driven simulation. The results indicate the benefits of adding another level to current SIMD array memory designs. Also, surprising results were obtained about the performance effects of varying cache associativity and block size. Together, they indicate that while SIMD array programs have sufficient locality to make PE caches worthwhile, the type of locality may differ fundamentally from that of serial machine and multiprocessor programs. Other results demonstrate the limitations of increasing the datapath width and inter-PE communication bandwidth without corresponding improvements in other processor features.
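For readers unfamiliar with trace-driven cache studies of the kind the abstract mentions, the following is a minimal, generic sketch of replaying an address trace through a set-associative cache while varying associativity and block size. It is not the paper's trace compilation technique, which the abstract does not describe. The function name `simulate_cache`, the LRU replacement policy, and the toy traces are assumptions made purely for illustration.

```python
from collections import OrderedDict


def simulate_cache(trace, num_sets, associativity, block_size):
    """Replay byte addresses through an LRU set-associative cache.

    Hypothetical helper for illustration only; returns the miss rate
    as a float in [0, 1] (0.0 for an empty trace).
    """
    # One ordered dict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // block_size      # block number containing the address
        index = block % num_sets        # set the block maps to
        tag = block // num_sets         # identifies the block within its set
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)      # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= associativity:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses / len(trace) if trace else 0.0


# Toy trace (assumed): two addresses that map to the same set.
conflict_trace = [0, 64] * 4
direct_mapped = simulate_cache(conflict_trace, num_sets=4, associativity=1, block_size=16)
two_way = simulate_cache(conflict_trace, num_sets=2, associativity=2, block_size=16)

# Toy trace (assumed): sequential 4-byte accesses over 64 bytes.
sequential_trace = list(range(0, 64, 4))
small_blocks = simulate_cache(sequential_trace, num_sets=4, associativity=1, block_size=16)
large_blocks = simulate_cache(sequential_trace, num_sets=4, associativity=1, block_size=32)
```

In this sketch, both conflict-trace configurations hold four lines in total, so any difference in miss rate comes from associativity alone. The sequential trace isolates the effect of block size in the same way. A real study would replay traces captured from actual SIMD array programs rather than hand-built lists.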