Performance of data-parallel primitives on the EM-4 dataflow parallel supercomputer

Andrew Shaw (MIT), Yuetsu Kodama, Mitsuhisa Sato, Shuichi Sakai, Yoshinori Yamaguchi
Published in: [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation
Publication date: 1992-10-19
DOI: 10.1109/FMPC.1992.234945 (https://doi.org/10.1109/FMPC.1992.234945)
Citations: 5

Abstract

The authors have implemented seven data-parallel primitives on the hybrid dataflow/von Neumann parallel computer EM-4. To evaluate the performance of these primitives, the authors compare them to identical primitives running on a CM-200 SIMD (single-instruction multiple-data) parallel computer. For integer arithmetic element-wise operations, EM-4 is faster than the CM-200 when two or more operations can be combined. For communications operations, EM-4 has significantly higher performance. EM-4's distinguishing feature in running data-parallel codes is its exceptional communications performance in terms of network bandwidth and latency, and its processor/network interface. Additional special-purpose hardware for barrier synchronization and scan-like operations is not necessary. Dataflow-style token synchronization is helpful but not necessary for implementing data-parallel primitives.
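The abstract does not name the seven primitives, so the following is only an illustrative sketch of the two kinds of operation it mentions: an element-wise operation in which two arithmetic steps are combined into one pass, and a scan built from ordinary pairwise exchanges rather than dedicated scan hardware. It is sequential Python, not EM-4 or CM-200 code; the function names `fused_elementwise` and `inclusive_scan` are hypothetical, and the log-step (Hillis-Steele) scan is a standard textbook technique, not necessarily the one the authors used.

```python
from operator import add


def fused_elementwise(a, b, c):
    """Compute (a + b) * c in one pass, combining two element-wise operations."""
    return [(x + y) * z for x, y, z in zip(a, b, c)]


def inclusive_scan(values, op=add):
    """Log-step (Hillis-Steele) inclusive scan.

    Each round, element i combines with element i - stride, and the stride
    doubles. On a parallel machine each round would be one pairwise exchange
    per element; here the rounds are simulated sequentially.
    """
    result = list(values)
    stride = 1
    while stride < len(result):
        prev = result  # every element reads the previous round's values
        result = [
            prev[i] if i < stride else op(prev[i - stride], prev[i])
            for i in range(len(prev))
        ]
        stride *= 2
    return result


if __name__ == "__main__":
    print(fused_elementwise([1, 2], [3, 4], [5, 6]))  # [20, 36]
    print(inclusive_scan([1, 2, 3, 4, 5]))            # [1, 3, 6, 10, 15]
```

The scan finishes in about log2(n) rounds of communication, which is why network latency and the processor/network interface matter so much for this class of primitive.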