Task scheduling with locality consideration for a clustered parallel FL reduction system

Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis. Pub Date: 1995-03-15. DOI: 10.1109/AISPAS.1995.401334

H. Shen, H. Kitajima, H. Kobayashi, T. Nakamura



Abstract

Multiprocessor systems offer performance beyond that of sequential computers. When constructing a multiprocessor system, task scheduling is one of the crucial issues affecting system performance. This paper studies task scheduling for a clustered parallel reduction system for the functional language FL. We construct a shared-memory multiprocessor system to realize parallel graph reduction of FL programs. The processing elements (PEs) in the system are divided into several clusters, and the PEs within each cluster are coupled through a local cache. Redexes with independent data are scheduled to different PEs and reduced simultaneously. In this system, the most critical problem is that too many memory accesses may restrict the scalability of system performance. To solve this problem, we take the locality of references into account so that the contents of a cluster cache remain usable across successive redex evaluation steps. We also pay close attention to PE utilization while handling the locality of references. As a result, both fewer memory accesses and lower PE idle ratios can be expected. We carry out software simulation to evaluate system performance under the proposed task scheduling strategy, and examine the simulation results to illustrate the effectiveness of the strategy.
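The abstract does not give the scheduling algorithm itself, so the following is only an illustrative sketch of the general idea of balancing cache locality against PE utilization. Everything in it is an assumption made for illustration: the names `Cluster`, `Redex`, `pick_cluster`, and `schedule`, the modeling of a cluster cache as a set of node identifiers, and the greedy rule (prefer the cluster whose cache overlaps most with the redex's references, but only among clusters that still have an idle PE). It is not the authors' strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical cluster: a count of idle PEs plus a shared cluster cache."""
    name: str
    idle_pes: int
    cache: set = field(default_factory=set)  # ids of graph nodes assumed cached


@dataclass
class Redex:
    """Hypothetical redex: a name and the ids of the graph nodes it references."""
    name: str
    refs: frozenset


def pick_cluster(redex, clusters):
    """Choose a cluster for one redex, or None if no PE is idle.

    Only clusters with an idle PE are considered, so a redex is never
    queued behind a busy cluster just because its data is cached there.
    Among those, the largest cache overlap wins; more idle PEs breaks ties.
    """
    candidates = [c for c in clusters if c.idle_pes > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda c: (len(redex.refs & c.cache), c.idle_pes))


def schedule(redexes, clusters):
    """Greedily place redexes for one evaluation step.

    Returns a dict mapping redex name to cluster name. Redexes that cannot
    be placed because every PE is busy are left out of the result.
    """
    placement = {}
    for redex in redexes:
        cluster = pick_cluster(redex, clusters)
        if cluster is None:
            break  # all PEs busy; remaining redexes wait for the next step
        cluster.idle_pes -= 1
        cluster.cache |= redex.refs  # assume the referenced nodes are now cached
        placement[redex.name] = cluster.name
    return placement
```

In this sketch, a redex whose data sits in a fully busy cluster is sent to another cluster rather than left waiting, which trades some locality for a lower idle ratio. A real cache would also have finite capacity and a replacement policy, which this model omits.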


