Communication Optimizations for Parallel Computing Using Data Access Information

M. Rinard
Proceedings of the IEEE/ACM SC95 Conference, published 1995-12-08. DOI: 10.1145/224170.224413 (https://doi.org/10.1145/224170.224413)
Citations: 7

Abstract

Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide, or parallelize communication may improve the performance of parallel computations. This paper describes our experience automatically applying communication optimizations in the context of Jade, a portable, implicitly parallel programming language designed for exploiting task-level concurrency. Jade programmers start with a program written in a standard serial, imperative language, then use Jade constructs to declare how parts of the program access data. The Jade implementation uses this data access information to automatically extract the concurrency and apply communication optimizations. Jade implementations exist for both shared memory and message passing machines; each Jade implementation applies communication optimizations appropriate for the machine on which it runs. We present performance results for several Jade applications running on both a shared memory machine (the Stanford DASH machine) and a message passing machine (the Intel iPSC/860). We use these results to characterize the overall performance impact of the communication optimizations. For our application set, the most important optimizations are replicating data for concurrent read access and improving the locality of the computation by placing tasks close to the data that they access. Broadcasting widely accessed data has a significant performance impact on one application; other optimizations, such as concurrently fetching remote data and overlapping computation with communication, have no effect.
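The abstract does not show Jade's constructs, so the following is only an illustrative sketch in Python, not Jade syntax and not the paper's implementation. It assumes each task declares the objects it reads and writes, and shows how such declarations could be used to derive which tasks must wait for which (preserving the serial program's order) and to pick a processor close to a task's data. The names `Task`, `conflicts`, `dependences`, `place`, and the `owner` map are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A unit of work plus its declared data accesses (hypothetical model)."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(earlier: Task, later: Task) -> bool:
    """True if `later` must wait for `earlier` to preserve serial semantics."""
    return bool(
        earlier.writes & (later.reads | later.writes)  # read- or write-after-write
        or earlier.reads & later.writes                # write-after-read
    )


def dependences(tasks):
    """Map each task name to the earlier tasks (in serial order) it must wait for."""
    deps = {}
    for i, later in enumerate(tasks):
        deps[later.name] = [t.name for t in tasks[:i] if conflicts(t, later)]
    return deps


def place(task, owner, default=0):
    """Locality heuristic (an assumption of this sketch): choose the processor
    that owns the most objects the task accesses; ties go to the lowest id."""
    votes = {}
    for obj in task.reads | task.writes:
        if obj in owner:
            votes[owner[obj]] = votes.get(owner[obj], 0) + 1
    if not votes:
        return default
    return min(votes, key=lambda p: (-votes[p], p))


# Tasks listed in the order the serial program would execute them.
tasks = [
    Task("init", writes=frozenset({"table"})),
    Task("scan_a", reads=frozenset({"table"}), writes=frozenset({"a"})),
    Task("scan_b", reads=frozenset({"table"}), writes=frozenset({"b"})),
    Task("merge", reads=frozenset({"a", "b"}), writes=frozenset({"out"})),
]
deps = dependences(tasks)
```

In this sketch, `scan_a` and `scan_b` both only read `table` and write disjoint objects, so neither waits for the other. That is the situation in which replicating `table` for concurrent read access could pay off, while `merge` has to wait for both scans.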