A quantitative analysis of locality in dataflow programs

MICRO 24, Pub Date: 1991-09-01, DOI: 10.1145/123465.123469

W. M. Miller, W. Najjar, A. Böhm


Citations: 13

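Abstract

Substantial evidence suggests that exploiting some forms of locality within dataflow programs can impact performance dramatically. This is the basic premise of several hybrid von Neumann-dataflow or multithreaded architectures. Identifying and exploiting locality in a fine-grained asynchronous execution model, however, is not trivial. In this paper, fine-grained intra-thread locality is defined, quantified and evaluated. The experimental measurements are based on the evaluation of a set of numeric and non-numeric benchmarks. The results point to a very large degree of thread locality: for example, over 70% of the instructions have to wait fewer than 5 instruction execution steps for their input data. Furthermore, the remarkable uniformity and consistency of the distribution of thread locality across a wide variety of benchmarks suggests that thread locality is highly dependent on the instruction set.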

