Globalizing selectively: Shared-memory efficiency with address-space separation

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC). Pub Date: 2013-11-17. DOI: 10.1145/2503210.2503275

N. Mahajan, Uday Pitambare, A. Chauhan

Citation count: 0

Abstract

It has become common for MPI-based applications to run on shared-memory machines. However, MPI semantics do not allow leveraging shared memory fully for communication between processes from within the MPI library. This paper presents an approach that combines compiler transformations with a specialized runtime system to achieve zero-copy communication whenever possible by proving certain properties statically and globalizing data selectively by altering the allocation and deallocation of communication buffers. The runtime system provides dynamic optimization, when such proofs are not possible statically, by copying data only when there are write-write or read-write conflicts. We implemented a prototype compiler, using ROSE, and evaluated it on several benchmarks. Our system produces code that performs better than MPI in most cases and no worse than MPI, tuned for shared memory, in all cases.
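
The copy-on-conflict behaviour described in the abstract can be illustrated with a small sketch. This is not the paper's runtime or its ROSE-based compiler: `Buffer`, `Handle`, and their methods are hypothetical names introduced only for illustration, and the two "processes" are modelled as plain objects inside one Python process. The sketch assumes a "send" can hand the receiver an alias to the sender's storage (zero-copy) and that a private copy is made only when one side writes while the storage is still shared, which would otherwise be a read-write or write-write conflict.

```python
class Buffer:
    """Backing storage plus a count of handles that alias it (hypothetical)."""

    def __init__(self, data):
        self.data = bytearray(data)  # bytearray(...) always makes a fresh copy
        self.refs = 1


class Handle:
    """One side's view of a communication buffer (hypothetical)."""

    def __init__(self, buf):
        self._buf = buf
        self.copies = 0  # how many times this handle had to privatize

    def send(self):
        """Model a zero-copy send: the receiver aliases the same storage."""
        self._buf.refs += 1
        return Handle(self._buf)

    def read(self, index):
        return self._buf.data[index]

    def write(self, index, value):
        # Writing to aliased storage would conflict with the other side's
        # reads or writes, so take a private copy first. Unshared storage
        # is updated in place with no copy.
        if self._buf.refs > 1:
            self._buf.refs -= 1
            self._buf = Buffer(self._buf.data)
            self.copies += 1
        self._buf.data[index] = value

    def shares_storage_with(self, other):
        return self._buf is other._buf


sender = Handle(Buffer(b"abcd"))
receiver = sender.send()          # no data copied at send time
shared_after_send = sender.shares_storage_with(receiver)

sender.write(0, ord("z"))         # conflict: sender privatizes before writing
receiver.write(1, ord("y"))       # storage no longer shared: written in place
```

In this model a buffer that is sent and never written again is never copied, which corresponds to the zero-copy case; the copy is paid only on the first conflicting write.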
