Mixed mode matrix multiplication

Meng-Shiou Wu, S. Aluru, R. Kendall
Proceedings. IEEE International Conference on Cluster Computing, pp. 195-203
Published: 2002-09-23
DOI: https://doi.org/10.1109/CLUSTR.2002.1137747
Citations: 11

Abstract

In modern cluster environments, where the memory hierarchy has many layers (distributed memory, shared memory, cache, ...), an important question is how to fully utilize all available resources and how to identify the dominant layer for a given computation. When algorithms for all layers are combined, which method extracts the best performance from the available resources? The mixed mode programming model, which uses thread programming on the shared memory layer and message passing on the distributed memory layer, is a method many researchers use to exploit the memory resources. We take an algorithmic approach that uses matrix multiplication as a tool to show how cache algorithms affect the performance of both shared memory and distributed memory algorithms. We show that with a good underlying cache algorithm, overall performance is stable. When the underlying cache algorithm is bad, superlinear speedup may occur, and increasing the number of threads may also improve performance.
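The layering described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple tiled (blocked) kernel as the "cache algorithm" and a thread pool that splits the output rows as the shared memory layer, and it leaves out the message passing layer entirely. The names `blocked_rows`, `threaded_blocked_matmul`, `num_threads`, and `block` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def blocked_rows(a, b, row_start, row_end, block):
    """Compute rows [row_start, row_end) of a @ b using block-sized tiles."""
    n_inner, n_cols = len(b), len(b[0])
    out = [[0.0] * n_cols for _ in range(row_end - row_start)]
    for kk in range(0, n_inner, block):        # tile over the inner dimension
        k_end = min(kk + block, n_inner)
        for jj in range(0, n_cols, block):     # tile over the output columns
            j_end = min(jj + block, n_cols)
            for i in range(row_start, row_end):
                a_row, out_row = a[i], out[i - row_start]
                for k in range(kk, k_end):
                    a_ik, b_row = a_row[k], b[k]
                    for j in range(jj, j_end):
                        out_row[j] += a_ik * b_row[j]
    return out


def threaded_blocked_matmul(a, b, num_threads=2, block=2):
    """Split the output rows across threads; each thread runs the tiled kernel."""
    n_rows = len(a)
    chunk = max(1, -(-n_rows // num_threads))  # ceiling division, at least 1
    ranges = [(s, min(s + chunk, n_rows)) for s in range(0, n_rows, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parts = list(pool.map(
            lambda r: blocked_rows(a, b, r[0], r[1], block), ranges))
    # Stitch the per-thread row bands back together in order.
    return [row for part in parts for row in part]


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[1, 0], [0, 1], [1, 1]]
    print(threaded_blocked_matmul(a, b, num_threads=2, block=2))
```

In CPython the global interpreter lock prevents pure-Python threads from running this arithmetic in parallel, so the sketch shows only the structure (a thread-level row decomposition over a tiled kernel), not the performance effects the paper measures.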