Reducing Synchronization Overhead in Parallel Simulation

Ulana Legedza, W. Weihl
Proceedings of Symposium on Parallel and Distributed Tools, published 1996-07-01
DOI: 10.1145/238788.238822 (https://doi.org/10.1145/238788.238822)
Citations: 48

Abstract

Synchronization is often the dominant cost in conservative parallel simulation, particularly in simulations of parallel computers, in which low-latency simulated communication requires frequent synchronization. We present and evaluate local barriers and predictive barrier scheduling, two techniques for reducing synchronization overhead in the simulation of message-passing multicomputers. Local barriers use nearest-neighbor synchronization to reduce waiting time at synchronization points. Predictive barrier scheduling, a novel technique that schedules synchronizations using both compile-time and runtime analysis, reduces the frequency of synchronization operations. In contrast to other work in this area, both techniques reduce synchronization overhead without decreasing the accuracy of network simulation. These techniques were evaluated by comparing their performance to that of periodic global synchronization. Experiments show that local barriers improve performance by up to 24% for communication-bound applications, while predictive barrier scheduling improves performance by up to 65% for applications with long local computation phases. Because the two techniques are complementary, we advocate a combined approach. This work was done in the context of Parallel Proteus, a new parallel simulator of message-passing multicomputers.
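
The intuition behind nearest-neighbor synchronization can be illustrated with a toy timing model. The sketch below is not taken from the paper or from Parallel Proteus: the function names, the ring topology, and the workload numbers are all assumptions made for illustration. It compares how long a set of host nodes takes to finish two simulation quanta when each node waits for every node (a global barrier) versus only its two ring neighbors (a local barrier). It does not model predictive barrier scheduling.

```python
def simulate(work, waits_for):
    """Return the time at which each node finishes its last quantum.

    work[k][i] is the (hypothetical) host time node i spends on quantum k.
    waits_for(i) lists the nodes that node i must wait for before it may
    start the next quantum.
    """
    n = len(work[0])
    finish = [0] * n
    for quantum in work:
        # A node starts a quantum only after it and every node it waits
        # for have finished the previous quantum.
        start = [max(finish[j] for j in {i, *waits_for(i)}) for i in range(n)]
        finish = [start[i] + quantum[i] for i in range(n)]
    return finish


def global_barrier(n):
    # Every node waits for all nodes.
    return lambda i: range(n)


def ring_barrier(n):
    # Assumed topology: each node waits only for its two ring neighbors.
    return lambda i: [(i - 1) % n, (i + 1) % n]


# Made-up workload: node 0 is slow in quantum 0, node 3 is slow in quantum 1.
work = [
    [5, 1, 1, 1, 1, 1],
    [1, 1, 1, 5, 1, 1],
]
n = len(work[0])

global_time = max(simulate(work, global_barrier(n)))
local_time = max(simulate(work, ring_barrier(n)))
print("global barrier:", global_time)
print("local barrier: ", local_time)
```

In this model the global barrier makes node 3 wait for slow node 0 before starting its own long quantum, whereas under the ring barrier node 3 is not adjacent to node 0 and proceeds immediately. The gap depends entirely on the assumed workload and topology and says nothing about the speedups reported in the abstract.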