Yuetsu Kodama, H. Sakane, H. Koike, M. Sato, S. Sakai, Y. Yamaguchi
{"title":"采用细粒度通信的基数排序程序并行执行","authors":"Yuetsu Kodama, H. Sakane, H. Koike, M. Sato, S. Sakai, Y. Yamaguchi","doi":"10.1109/PACT.1997.644010","DOIUrl":null,"url":null,"abstract":"The report presents empirical results of fine-grain communication on the 80-processor EM-X distributed-memory multiprocessor. EM-X has hardware support for low latency, high throughput fine-grain communication-this hardware support includes packet generation integrated into the instruction execution pipeline for single-cycle communication overhead, direct memory access for remote references, and rapid context switching for latency tolerance. The authors study the fine-grain communication performance of integer radix sort, a code with irregular communication, on EM-X, and compare it to the Fujitsu AP1000+ and the Cray Server CS6400. The experimental results indicate that EM-X achieves high throughput and low overhead for fine-grain communication. Whereas EM-X's communication performance scales perfectly as one increases the number of processors, other coarse-grain message-passing machines exhibit fluctuation and performance degradation for larger configurations due to network contention.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Parallel execution of radix sort program using fine-grain communication\",\"authors\":\"Yuetsu Kodama, H. Sakane, H. Koike, M. Sato, S. Sakai, Y. Yamaguchi\",\"doi\":\"10.1109/PACT.1997.644010\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The report presents empirical results of fine-grain communication on the 80-processor EM-X distributed-memory multiprocessor. EM-X has hardware support for low latency, high throughput fine-grain communication-this hardware support includes packet generation integrated into the instruction execution pipeline for single-cycle communication overhead, direct memory access for remote references, and rapid context switching for latency tolerance. The authors study the fine-grain communication performance of integer radix sort, a code with irregular communication, on EM-X, and compare it to the Fujitsu AP1000+ and the Cray Server CS6400. The experimental results indicate that EM-X achieves high throughput and low overhead for fine-grain communication. Whereas EM-X's communication performance scales perfectly as one increases the number of processors, other coarse-grain message-passing machines exhibit fluctuation and performance degradation for larger configurations due to network contention.\",\"PeriodicalId\":177411,\"journal\":{\"name\":\"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1997-11-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PACT.1997.644010\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.1997.644010","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
摘要
本文给出了在80处理器EM-X分布式内存多处理器上实现细粒度通信的实证结果。EM-X具有对低延迟、高吞吐量细粒度通信的硬件支持——这种硬件支持包括集成到指令执行管道中的数据包生成,用于单周期通信开销,用于远程引用的直接内存访问,以及用于延迟容忍的快速上下文切换。研究了不规则通信的整数基数排序码在EM-X上的细粒度通信性能,并与富士通AP1000+和Cray Server CS6400进行了比较。实验结果表明,EM-X实现了高吞吐量和低开销的细粒度通信。EM-X的通信性能随着处理器数量的增加而完美扩展,而其他粗粒度消息传递机器由于网络争用而在较大配置下表现出波动和性能下降。
Parallel execution of radix sort program using fine-grain communication
The report presents empirical results of fine-grain communication on the 80-processor EM-X distributed-memory multiprocessor. EM-X has hardware support for low latency, high throughput fine-grain communication-this hardware support includes packet generation integrated into the instruction execution pipeline for single-cycle communication overhead, direct memory access for remote references, and rapid context switching for latency tolerance. The authors study the fine-grain communication performance of integer radix sort, a code with irregular communication, on EM-X, and compare it to the Fujitsu AP1000+ and the Cray Server CS6400. The experimental results indicate that EM-X achieves high throughput and low overhead for fine-grain communication. Whereas EM-X's communication performance scales perfectly as one increases the number of processors, other coarse-grain message-passing machines exhibit fluctuation and performance degradation for larger configurations due to network contention.