Dynamic percolation: a case of study on the shortcomings of traditional optimization in many-core architectures

E. Garcia, Daniel A. Orozco, R. Khan, Ioannis E. Venetis, Kelly Livingston, G. Gao
Published in: ACM International Conference on Computing Frontiers, 2012-05-15
DOI: 10.1145/2212908.2212944 (https://doi.org/10.1145/2212908.2212944)
Citations: 14

Abstract

This paper discusses the shortcomings of traditional static optimization techniques when applied to many-core architectures. We argue that these shortcomings result from the significantly different environment found in many-cores. We analyze previous attempts to optimize Dense Matrix Multiplication (DMM) that failed to achieve high performance despite extensive optimization efforts.

We have found that percolation (prefetching data) and scheduling play a central role in application performance. To overcome those difficulties, we have (1) fused dynamic scheduling and percolation into a dynamic percolation approach and (2) added further percolation operations. Our new techniques enabled us to increase the performance of the application in our study from 44 GFLOPS (out of 80 GFLOPS possible) to 70.0 GFLOPS (operands in SRAM) or 65.6 GFLOPS (operands in DRAM).
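The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea of combining dynamic scheduling with percolation for a tiled dense matrix multiplication. It assumes square matrices stored as lists of lists and a tile size that divides the matrix dimension. The names `copy_tile` and `dynamic_percolation_matmul` are hypothetical, and plain Python threads stand in for hardware thread units. The copy of the next operand tiles is issued before the current tiles are multiplied, which only mimics the ordering of a prefetch; on real many-core hardware that transfer would run asynchronously alongside the computation.

```python
import queue
import threading


def copy_tile(m, r0, c0, t):
    """Percolate a t x t tile: copy it from the shared matrix into a private buffer."""
    return [row[c0:c0 + t] for row in m[r0:r0 + t]]


def dynamic_percolation_matmul(a, b, n, t, workers=4):
    """Return C = A x B for n x n lists of lists; the tile size t must divide n."""
    c = [[0.0] * n for _ in range(n)]

    # Dynamic scheduling: every output tile is a task in a shared queue,
    # and each worker takes the next task as soon as it becomes idle.
    tasks = queue.Queue()
    for bi in range(0, n, t):
        for bj in range(0, n, t):
            tasks.put((bi, bj))

    def worker():
        while True:
            try:
                bi, bj = tasks.get_nowait()
            except queue.Empty:
                return  # no work left
            acc = [[0.0] * t for _ in range(t)]
            # Percolate the operands needed for the first step.
            nxt = (copy_tile(a, bi, 0, t), copy_tile(b, 0, bj, t))
            for bk in range(0, n, t):
                ta, tb = nxt
                # Issue the copy for the next step before computing this one.
                # (Here the copy is synchronous; it is an assumption of this
                # sketch that real hardware would overlap it with the compute.)
                if bk + t < n:
                    nxt = (copy_tile(a, bi, bk + t, t),
                           copy_tile(b, bk + t, bj, t))
                # Multiply the two local tiles and accumulate.
                for i in range(t):
                    for k in range(t):
                        aik = ta[i][k]
                        for j in range(t):
                            acc[i][j] += aik * tb[k][j]
            # Each output tile is written by exactly one worker.
            for i in range(t):
                for j in range(t):
                    c[bi + i][bj + j] = acc[i][j]

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return c
```

Calling `dynamic_percolation_matmul(a, b, 4, 2)` on two 4 x 4 matrices splits the result into four 2 x 2 output tiles that the workers pick up in whatever order they become free. The sketch illustrates the scheduling and data-movement pattern only and makes no claim about the performance figures reported in the abstract.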