The Benefits of Clustering in Shared Address Space Multiprocessors: An Applications-Driven Investigation

Andrew Erlichson, B. A. Nayfeh, J. Singh, K. Olukotun
{"title":"The Benefits of Clustering in Shared Address Space Multiprocessors: An Applications-Driven Investigation","authors":"Andrew Erlichson, B. A. Nayfeh, J. Singh, K. Olukotun","doi":"10.1145/224170.224397","DOIUrl":null,"url":null,"abstract":"Clustering processors together at a level of the memory hierarchy in shared address space multiprocessors appears to be an attractive technique from several standpoints: Resources are shared, packaging technologies are exploited, and processors within a cluster can share data more effectively. We investigate the performance benefits that can be obtained by clustering on a range of important scientific and engineering applications in moderate to large scale cache coherent machines with small degrees of clustering (up to one eighth of the total number of processors in a cluster). We find that except for applications with near neighbor communication topologies this degree of clustering is not very effective in reducing the inherent communication to computation ratios. Clustering is more useful in reducing the the number of remote capacity misses in unstructured applications, and can improve performance substantially when small first-level caches are clustered in these cases. This suggests that clustering at the first level cache might be useful in highly-integrated, relatively fine-grained environments. For less integrated machines such as current distributed shared memory multiprocessors, our results suggest that clustering at the first-level caches is not very useful in improving application performance; however our results also suggest that in an machine with long interprocessor communication latencies, clustering further away from the processor can provide performance benefits.","PeriodicalId":269909,"journal":{"name":"Proceedings of the IEEE/ACM SC95 Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"31","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the IEEE/ACM SC95 Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/224170.224397","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 31

Abstract

Clustering processors together at a level of the memory hierarchy in shared address space multiprocessors appears to be an attractive technique from several standpoints: Resources are shared, packaging technologies are exploited, and processors within a cluster can share data more effectively. We investigate the performance benefits that can be obtained by clustering on a range of important scientific and engineering applications in moderate to large scale cache coherent machines with small degrees of clustering (up to one eighth of the total number of processors in a cluster). We find that except for applications with near neighbor communication topologies this degree of clustering is not very effective in reducing the inherent communication to computation ratios. Clustering is more useful in reducing the the number of remote capacity misses in unstructured applications, and can improve performance substantially when small first-level caches are clustered in these cases. This suggests that clustering at the first level cache might be useful in highly-integrated, relatively fine-grained environments. For less integrated machines such as current distributed shared memory multiprocessors, our results suggest that clustering at the first-level caches is not very useful in improving application performance; however our results also suggest that in an machine with long interprocessor communication latencies, clustering further away from the processor can provide performance benefits.
共享地址空间多处理器集群的优势:一项应用驱动的研究
在共享地址空间多处理器的内存层次结构中,将处理器聚集在一起从几个角度来看似乎是一种有吸引力的技术:资源是共享的,封装技术得到利用,集群中的处理器可以更有效地共享数据。我们研究了在具有小集群程度(最多集群中处理器总数的八分之一)的中型到大型缓存一致机器上对一系列重要的科学和工程应用程序进行集群所能获得的性能优势。我们发现,除了具有近邻通信拓扑的应用程序外,这种程度的聚类在降低固有通信与计算比方面不是很有效。集群在减少非结构化应用程序中远程容量丢失的数量方面更有用,并且在这些情况下,当小型一级缓存集群时,可以大大提高性能。这表明,第一级缓存的集群在高度集成、相对细粒度的环境中可能很有用。对于集成度较低的机器,如当前的分布式共享内存多处理器,我们的结果表明,在一级缓存上集群对提高应用程序性能不是很有用;然而,我们的结果还表明,在处理器间通信延迟较长的机器中,远离处理器的集群可以提供性能优势。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信