Ownership passing: efficient distributed memory programming on multi-core systems

A. Friedley, T. Hoefler, G. Bronevetsky, A. Lumsdaine, Ching-Chen Ma
{"title":"Ownership passing: efficient distributed memory programming on multi-core systems","authors":"A. Friedley, T. Hoefler, G. Bronevetsky, A. Lumsdaine, Ching-Chen Ma","doi":"10.1145/2442516.2442534","DOIUrl":null,"url":null,"abstract":"The number of cores in multi- and many-core high-performance processors is steadily increasing. MPI, the de-facto standard for programming high-performance computing systems offers a distributed memory programming model. MPI's semantics force a copy from one process' send buffer to another process' receive buffer. This makes it difficult to achieve the same performance on modern hardware than shared memory programs which are arguably harder to maintain and debug. We propose generalizing MPI's communication model to include ownership passing, which make it possible to fully leverage the shared memory hardware of multi- and many-core CPUs to stream communicated data concurrently with the receiver's computations on it. The benefits and simplicity of message passing are retained by extending MPI with calls to send (pass) ownership of memory regions, instead of their contents, between processes. Ownership passing is achieved with a hybrid MPI implementation that runs MPI processes as threads and is mostly transparent to the user. We propose an API and a static analysis technique to transform legacy MPI codes automatically and transparently to the programmer, demonstrating that this scheme is easy to use in practice. Using the ownership passing technique, we see up to 51% communication speedups over a standard message passing implementation on state-of-the art multicore systems. Our analysis and interface will lay the groundwork for future development of MPI-aware optimizing compilers and multi-core specific optimizations, which will be key for success in current and next-generation computing platforms.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2442516.2442534","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 26

Abstract

The number of cores in multi- and many-core high-performance processors is steadily increasing. MPI, the de-facto standard for programming high-performance computing systems, offers a distributed memory programming model. MPI's semantics force a copy from one process's send buffer to another process's receive buffer. This makes it difficult to match the performance of shared memory programs on modern hardware, even though such programs are arguably harder to maintain and debug. We propose generalizing MPI's communication model to include ownership passing, which makes it possible to fully leverage the shared memory hardware of multi- and many-core CPUs to stream communicated data concurrently with the receiver's computations on it. The benefits and simplicity of message passing are retained by extending MPI with calls that send (pass) ownership of memory regions, instead of their contents, between processes. Ownership passing is achieved with a hybrid MPI implementation that runs MPI processes as threads and is mostly transparent to the user. We propose an API and a static analysis technique that transform legacy MPI codes automatically and transparently to the programmer, demonstrating that this scheme is easy to use in practice. Using the ownership passing technique, we observe communication speedups of up to 51% over a standard message passing implementation on state-of-the-art multicore systems. Our analysis and interface lay the groundwork for future development of MPI-aware optimizing compilers and multi-core specific optimizations, which will be key to success on current and next-generation computing platforms.
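
The paper's ownership-passing API itself is not reproduced here. As a rough illustration of the underlying idea only (the receiver computes on the sender's buffer in place rather than on a copy), the sketch below uses standard MPI-3 shared-memory windows; the window calls, the small `token` notification message, and the producer/consumer roles are assumptions made for this example, not the paper's interface, which instead runs MPI ranks as threads inside one address space.

```c
/*
 * Illustrative sketch only (not the paper's API): the paper extends MPI with
 * ownership-passing calls and runs MPI ranks as threads in one address space.
 * Here the same zero-copy idea is approximated with standard MPI-3 shared-
 * memory windows: the producer hands over a tiny notification instead of the
 * data, and the consumer computes on the producer's buffer in place.
 * All names (node, token, owned, N) are local to this example.
 */
#include <mpi.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* Communicator containing only ranks that share physical memory. */
    MPI_Comm node;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);
    int rank, size;
    MPI_Comm_rank(node, &rank);
    MPI_Comm_size(node, &size);
    if (size < 2) MPI_Abort(MPI_COMM_WORLD, 1);

    /* Rank 0 provides the backing storage; rank 1 allocates nothing. */
    double  *buf;
    MPI_Win  win;
    MPI_Aint bytes = (rank == 0) ? (MPI_Aint)(N * sizeof(double)) : 0;
    MPI_Win_allocate_shared(bytes, sizeof(double), MPI_INFO_NULL,
                            node, &buf, &win);
    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);

    if (rank == 0) {
        for (int i = 0; i < N; i++) buf[i] = (double)i;  /* produce data */
        MPI_Win_sync(win);                               /* publish writes */
        int token = 0;                                   /* "pass" the buffer: send */
        MPI_Send(&token, 1, MPI_INT, 1, 0, node);        /* a token, not N doubles  */
    } else if (rank == 1) {
        int token;
        MPI_Recv(&token, 1, MPI_INT, 0, 0, node, MPI_STATUS_IGNORE);
        MPI_Win_sync(win);                               /* observe producer's writes */

        /* Locate rank 0's segment and compute on it directly, no copy. */
        MPI_Aint sz; int disp; double *owned;
        MPI_Win_shared_query(win, 0, &sz, &disp, &owned);
        double sum = 0.0;
        for (int i = 0; i < N; i++) sum += owned[i];
        printf("consumer computed sum = %.1f without copying the buffer\n", sum);
    }

    MPI_Win_unlock_all(win);
    MPI_Win_free(&win);
    MPI_Comm_free(&node);
    MPI_Finalize();
    return 0;
}
```

In a conventional MPI_Send/MPI_Recv exchange the N doubles would be copied into the receiver's buffer; in this sketch only a 4-byte token crosses between ranks, which is the kind of copy elimination the paper's ownership-passing extension aims to provide transparently inside MPI itself.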