Asynchronous Work Stealing on Distributed Memory Systems

Shigang Li, Jingyuan Hu, Xin Cheng, Chongchong Zhao
2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
DOI: 10.1109/PDP.2013.35 (https://doi.org/10.1109/PDP.2013.35)
Published: 2013-02-27
Citations: 13

Abstract

Work stealing is a popular policy for dynamic load balancing of irregular applications. However, the communication overhead it incurs can reduce its efficiency, especially on distributed memory systems. In this work we propose an asynchronous work stealing (AsynchWS) strategy that exploits opportunities to overlap communication with the execution of local residual tasks. Profiling information is collected locally to optimize task granularity and guide the asynchronous work stealing. AsynchWS is implemented in Unified Parallel C (UPC), which effectively supports non-blocking one-sided communication and facilitates the implementation. Experiments are conducted on a 32-node Xeon X5650 cluster using a set of irregular applications. Results show up to 16% better performance than state-of-the-art strategies on distributed memory.
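The idea of overlapping a steal request with the remaining local tasks can be illustrated with a toy discrete-time model. This is only a sketch under stated assumptions, not the authors' UPC implementation: the function `simulate` and its parameters `latency`, `chunk`, and `threshold` are hypothetical names, and a fixed `threshold` stands in for the profiling-guided decision the abstract mentions. A synchronous thief asks for work only once its queue is empty and then idles for the full latency; an asynchronous thief issues the request while a few residual tasks are still queued and keeps executing them while the request is in flight.

```python
from collections import deque


def simulate(local, victim, latency, chunk, threshold, asynchronous):
    """Return the time one worker needs to finish all tasks (toy model).

    local, victim: lists of task costs in abstract time units.
    latency:       time units one steal takes to complete (assumed constant).
    chunk:         maximum number of tasks moved per steal (assumed).
    threshold:     residual queue length at which an asynchronous
                   steal is issued (hypothetical stand-in for profiling).
    """
    local = deque(local)
    victim = deque(victim)
    clock = 0
    arrival = None   # completion time of the in-flight steal, if any
    in_flight = []   # tasks carried by the in-flight steal

    while local or victim or arrival is not None:
        # A synchronous thief only steals once its queue is empty.
        trigger = threshold if asynchronous else 0
        if arrival is None and victim and len(local) <= trigger:
            count = min(chunk, len(victim))
            in_flight = [victim.popleft() for _ in range(count)]
            arrival = clock + latency

        if local:
            # Run one residual task; the steal proceeds in the background.
            clock += local.popleft()
            if arrival is not None and arrival <= clock:
                local.extend(in_flight)
                arrival = None
        else:
            # No local work left: idle until the stolen tasks arrive.
            clock = max(clock, arrival)
            local.extend(in_flight)
            arrival = None

    return clock
```

With four local and four remote tasks of cost 2, `latency=3`, `chunk=2`, and `threshold=2`, the synchronous variant takes 22 time units (16 of work plus two exposed steals of 3 units each), while the asynchronous variant takes 16, because both steals complete while residual tasks are still running. The numbers describe this toy model only and say nothing about the measured results in the paper.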