Title: A quantitative analysis of locality in dataflow programs
Authors: W. M. Miller, W. Najjar, A. Böhm
Venue: MICRO 24
Published: 1991-09-01
DOI: https://doi.org/10.1145/123465.123469
Citations: 13
Abstract
Substantial evidence suggests that exploiting some forms of locality within dataflow programs can impact performance dramatically. This is the basic premise of several hybrid von Neumann-dataflow or multithreaded architectures. Identifying and exploiting locality, however, in a fine-grained asynchronous execution model is not trivial. In this paper, fine-grained intra-thread locality is defined, quantified and evaluated. These experimental measurements are based on the evaluation of a set of numeric and non-numeric benchmarks. The results point to a very large degree of thread locality: for example, over 70% of the instructions have to wait less than 5 instruction execution steps for their input data. Furthermore, the remarkable uniformity and consistency of the distribution of thread locality across a wide variety of benchmarks suggests that thread locality is highly dependent on the instruction set.