{"title":"CRL:高性能全软件分布式共享内存","authors":"Kirk L. Johnson, M. Kaashoek, D. Wallach","doi":"10.1145/224056.224073","DOIUrl":null,"url":null,"abstract":"The C Region Library (CRL) is a new all-software distributed shared memory (DSM) system. CRL requires no special compiler, hardware, or operating system support beyond the ability to send and receive messages between processing nodes. It provides a simple, portable, region-based shared address space programming model that is capable of delivering good performance on a wide range of multiprocessor and distributed system architectures. Each region is an arbitrarily sized, contiguous area of memory. The programmer defines regions and delimits accesses to them using annotations. \nCRL implementations have been developed for two platforms: the Thinking Machines CM-5, a commercial multicomputer, and the MIT Alewife machine, an experimental multiprocessor offering efficient hardware support for both message passing and shared memory. Results are presented for up to 128 processors on the CM-5 and up to 32 processors on Alewife. \nUsing Alewife as a vehicle, this thesis presents results from the first completely controlled comparison of scalable hardware and software DSM systems. These results indicate that CRL is capable of delivering performance that is competitive with hardware DSM systems: CRL achieves speedups within 15% of those provided by Alewife's native hardware-supported shared memory, even for challenging applications (e.g., Barnes-Hut) and small problem sizes. \nA second set of experimental results provides insight into the sensitivity of CRL's performance to increased communication costs (both higher latency and lower bandwidth). These results demonstrate that even for relatively challenging applications, CRL should be capable of delivering reasonable performance on current-generation distributed systems. \nTaken together, these results indicate the substantial promise of CRL and other all-software approaches to providing shared memory functionality and suggest that in many cases special-purpose hardware support for shared memory may not be necessary. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)","PeriodicalId":168455,"journal":{"name":"Proceedings of the fifteenth ACM symposium on Operating systems principles","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"277","resultStr":"{\"title\":\"CRL: high-performance all-software distributed shared memory\",\"authors\":\"Kirk L. Johnson, M. Kaashoek, D. Wallach\",\"doi\":\"10.1145/224056.224073\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The C Region Library (CRL) is a new all-software distributed shared memory (DSM) system. CRL requires no special compiler, hardware, or operating system support beyond the ability to send and receive messages between processing nodes. It provides a simple, portable, region-based shared address space programming model that is capable of delivering good performance on a wide range of multiprocessor and distributed system architectures. Each region is an arbitrarily sized, contiguous area of memory. The programmer defines regions and delimits accesses to them using annotations. \\nCRL implementations have been developed for two platforms: the Thinking Machines CM-5, a commercial multicomputer, and the MIT Alewife machine, an experimental multiprocessor offering efficient hardware support for both message passing and shared memory. Results are presented for up to 128 processors on the CM-5 and up to 32 processors on Alewife. \\nUsing Alewife as a vehicle, this thesis presents results from the first completely controlled comparison of scalable hardware and software DSM systems. These results indicate that CRL is capable of delivering performance that is competitive with hardware DSM systems: CRL achieves speedups within 15% of those provided by Alewife's native hardware-supported shared memory, even for challenging applications (e.g., Barnes-Hut) and small problem sizes. \\nA second set of experimental results provides insight into the sensitivity of CRL's performance to increased communication costs (both higher latency and lower bandwidth). These results demonstrate that even for relatively challenging applications, CRL should be capable of delivering reasonable performance on current-generation distributed systems. \\nTaken together, these results indicate the substantial promise of CRL and other all-software approaches to providing shared memory functionality and suggest that in many cases special-purpose hardware support for shared memory may not be necessary. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)\",\"PeriodicalId\":168455,\"journal\":{\"name\":\"Proceedings of the fifteenth ACM symposium on Operating systems principles\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-12-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"277\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the fifteenth ACM symposium on Operating systems principles\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/224056.224073\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the fifteenth ACM symposium on Operating systems principles","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/224056.224073","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 277
摘要
CRL (C Region Library)是一种新的全软件分布式共享内存(DSM)系统。除了能够在处理节点之间发送和接收消息之外,CRL不需要特殊的编译器、硬件或操作系统支持。它提供了一个简单的、可移植的、基于区域的共享地址空间编程模型,能够在广泛的多处理器和分布式系统架构上提供良好的性能。每个区域是一个任意大小的、连续的内存区域。程序员定义区域并使用注释分隔对它们的访问。CRL实现是为两个平台开发的:Thinking Machines CM-5(一种商用多计算机)和MIT Alewife(一种实验性多处理器,为消息传递和共享内存提供高效的硬件支持)。CM-5上最多128个处理器和Alewife上最多32个处理器的结果。本文以Alewife为载体,首次对可扩展的硬件和软件DSM系统进行了完全可控的比较。这些结果表明,CRL能够提供与硬件DSM系统相竞争的性能:即使对于具有挑战性的应用程序(例如Barnes-Hut)和小问题,CRL的加速速度也比Alewife的本地硬件支持的共享内存提高了15%。第二组实验结果揭示了CRL性能对增加的通信成本(更高的延迟和更低的带宽)的敏感性。这些结果表明,即使对于相对具有挑战性的应用程序,CRL也应该能够在当前的分布式系统上提供合理的性能。综上所述,这些结果表明CRL和其他全软件方法在提供共享内存功能方面大有希望,并表明在许多情况下,可能没有必要为共享内存提供专用硬件支持。(麻省理工学院图书馆独家提供,14-0551室,剑桥,马萨诸塞州02139-4307。博士617-253-5668;传真617-253-1690。)
The C Region Library (CRL) is a new all-software distributed shared memory (DSM) system. CRL requires no special compiler, hardware, or operating system support beyond the ability to send and receive messages between processing nodes. It provides a simple, portable, region-based shared address space programming model that is capable of delivering good performance on a wide range of multiprocessor and distributed system architectures. Each region is an arbitrarily sized, contiguous area of memory. The programmer defines regions and delimits accesses to them using annotations.
CRL implementations have been developed for two platforms: the Thinking Machines CM-5, a commercial multicomputer, and the MIT Alewife machine, an experimental multiprocessor offering efficient hardware support for both message passing and shared memory. Results are presented for up to 128 processors on the CM-5 and up to 32 processors on Alewife.
Using Alewife as a vehicle, this thesis presents results from the first completely controlled comparison of scalable hardware and software DSM systems. These results indicate that CRL is capable of delivering performance that is competitive with hardware DSM systems: CRL achieves speedups within 15% of those provided by Alewife's native hardware-supported shared memory, even for challenging applications (e.g., Barnes-Hut) and small problem sizes.
A second set of experimental results provides insight into the sensitivity of CRL's performance to increased communication costs (both higher latency and lower bandwidth). These results demonstrate that even for relatively challenging applications, CRL should be capable of delivering reasonable performance on current-generation distributed systems.
Taken together, these results indicate the substantial promise of CRL and other all-software approaches to providing shared memory functionality and suggest that in many cases special-purpose hardware support for shared memory may not be necessary. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)