Accelerating List Management for MPI

2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-07-01. DOI: 10.1109/CLUSTR.2005.347036

K. Underwood, Arun Rodrigues, K. S. Hemmert

Citations: 6

Abstract

The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero-length queue case, with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero-length queue case and by nearly 100% in the presence of queues of 30 messages.
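The traversal cost the abstract refers to can be illustrated with a small software model. The sketch below is not the paper's hardware unit or any real MPI library's code; it assumes a simplified match rule (communicator, source, and tag, with wildcards) and uses hypothetical names (`PostedReceive`, `PostedReceiveQueue`, `ANY_SOURCE`, `ANY_TAG`). A Python list stands in for the linked list. It only shows why the number of entries examined per incoming message grows with the length of the posted receive queue.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG.
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass
class PostedReceive:
    source: int     # expected sender rank, or ANY_SOURCE
    tag: int        # expected tag, or ANY_TAG
    comm: int       # communicator identifier (simplified to an int)
    buffer_id: int  # illustrative handle for the user's receive buffer


def matches(recv: PostedReceive, source: int, tag: int, comm: int) -> bool:
    """Simplified match rule: same communicator, source and tag equal or wildcard."""
    return (recv.comm == comm
            and recv.source in (ANY_SOURCE, source)
            and recv.tag in (ANY_TAG, tag))


class PostedReceiveQueue:
    """Posted receives kept in posting order; the oldest matching entry wins."""

    def __init__(self) -> None:
        self._entries: List[PostedReceive] = []

    def post(self, recv: PostedReceive) -> None:
        # List insertion: new receives go to the tail.
        self._entries.append(recv)

    def match(self, source: int, tag: int, comm: int
              ) -> Tuple[Optional[PostedReceive], int]:
        """Return (matched receive or None, number of entries examined)."""
        for visited, recv in enumerate(self._entries, start=1):
            if matches(recv, source, tag, comm):
                # List deletion: a matched receive is consumed.
                del self._entries[visited - 1]
                return recv, visited
        # No match: every entry was examined.
        return None, len(self._entries)


if __name__ == "__main__":
    queue = PostedReceiveQueue()
    # 30 posted receives that will not match the incoming message.
    for i in range(30):
        queue.post(PostedReceive(source=1, tag=100 + i, comm=0, buffer_id=i))
    # The receive that will match, posted last.
    queue.post(PostedReceive(source=ANY_SOURCE, tag=7, comm=0, buffer_id=99))

    recv, visited = queue.match(source=3, tag=7, comm=0)
    print(recv.buffer_id, visited)
```

In this model an incoming message that matches only the last of 31 posted receives forces all 31 entries to be examined, which is the per-message work the abstract proposes moving from the embedded processor into a hardware unit.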



