Quantitative studies of processing element granularity
T. C. Marek, E. Davis
[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation, 1992-10-19
DOI: 10.1109/FMPC.1992.234925 (https://doi.org/10.1109/FMPC.1992.234925)
Citations: 3
Abstract
Quantitative results of experiments on PE (processing element) granularities are presented. An architecture simulation workbench has been developed for experiments on PE granularities of 1, 4, 8, and 16 bits. The workbench also supports analysis of the impact of various I/O (input/output) and communication path widths. Overall performance, communication balance, PE utilization, and operand lengths can be monitored to evaluate the merits of various granularities and feature sets. This workbench has been used to run a set of benchmark algorithms that cover a range of computation and communication requirements, a range of data sizes, and a range of problem array sizes. The authors report results for two of the algorithms studied by T.C. Marek (1992): image rotation and image resampling. The results obtained are counterintuitive. They indicate that bit-serial machines have performance advantages for two reasons: inherently bit-oriented activity, even when multi-bit operands are used, and inter-PE communication over paths that are narrower than the processor granularity.
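The two effects named in the last sentence can be illustrated with a toy cycle-count model. This is a minimal sketch and not the authors' simulation workbench: the function names and the cost assumptions (a PE processes `pe_bits` of an operand per cycle, and a link moves `path_bits` per cycle) are hypothetical simplifications introduced here for illustration only.

```python
from math import ceil

# PE widths in bits, as listed in the abstract.
GRANULARITIES = (1, 4, 8, 16)


def alu_cycles(operand_bits: int, pe_bits: int) -> int:
    """Cycles for one pass over an operand (assumed: pe_bits handled per cycle)."""
    return ceil(operand_bits / pe_bits)


def utilization(operand_bits: int, pe_bits: int) -> float:
    """Fraction of datapath bit-slots doing useful work during that pass."""
    return operand_bits / (pe_bits * alu_cycles(operand_bits, pe_bits))


def transfer_cycles(operand_bits: int, path_bits: int) -> int:
    """Cycles to move one operand over a link (assumed: path_bits moved per cycle)."""
    return ceil(operand_bits / path_bits)


if __name__ == "__main__":
    # A 1-bit flag or mask keeps a 1-bit PE fully busy but wastes most of a wide PE.
    for g in GRANULARITIES:
        print(f"PE width {g:2d}: utilization on a 1-bit operand = {utilization(1, g):.4f}")
    # With a 1-bit path, moving a 16-bit operand costs 16 cycles regardless of PE width.
    print("transfer cycles, 16-bit operand, 1-bit path:", transfer_cycles(16, 1))
```

Under these assumptions a 1-bit PE has utilization 1.0 for every operand length, while a wider PE loses utilization whenever the operand length is not a multiple of its width. Likewise, when the path is narrower than the PE, the transfer time is set by the path width, so the wider datapath sits idle during communication. The actual magnitudes reported in the paper depend on the benchmark algorithms and are not reproduced by this model.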