DiMP: Architectural Support for Direct Message Passing on Shared Memory Multi-cores
Rubén Titos-Gil, Oscar Palomar, O. Unsal, A. Cristal
2015 44th International Conference on Parallel Processing, September 2015
DOI: 10.1109/ICPP.2015.22 (https://doi.org/10.1109/ICPP.2015.22)
Citations: 2
Abstract
Thanks to programming approaches like actor-based models, message passing is regaining popularity outside large-scale scientific computing for building scalable distributed applications on many-core processors. Unfortunately, the mismatch between message passing models and the shared-memory hardware provided by today's commercial vendors results in suboptimal performance and loss of efficiency. This paper presents a set of architectural extensions that reduce the overheads incurred by message passing workloads running on shared-memory multi-core architectures. It describes the instruction set extensions and their hardware implementation. To facilitate programmability, the proposed extensions are used by a message passing library, allowing programs to take advantage of them transparently. As a proof of concept, we use a modified MPICH library and MPI programs to evaluate the proposal. Experimental results show that, on average, our proposal spends 60% fewer cycles performing data transfers in MPI functions, and reduces the L1 data cache misses in those functions to one fourth.