Ameliorating memory contention of OLAP operators on GPU processors

Evangelia A. Sitaridi, K. A. Ross
{"title":"Ameliorating memory contention of OLAP operators on GPU processors","authors":"Evangelia A. Sitaridi, K. A. Ross","doi":"10.1145/2236584.2236590","DOIUrl":null,"url":null,"abstract":"Implementations of database operators on GPU processors have shown dramatic performance improvement compared to multicore-CPU implementations. GPU threads can cooperate using shared memory, which is organized in interleaved banks and is fast only when threads read and modify addresses belonging to distinct memory banks. Therefore, data processing operators implemented on a GPU, in addition to contention caused by popular values, have to deal with a new performance limiting factor: thread serialization when accessing values belonging to the same bank.\n Here, we define the problem of bank and value conflict optimization for data processing operators using the CUDA platform. To analyze the impact of these two factors on operator performance we use two database operations: foreignkey join and grouped aggregation. We suggest and evaluate techniques for optimizing the data arrangement offline by creating clones of values to reduce overall memory contention. Results indicate that columns used for writes, as grouping columns, need be optimized to fully exploit the maximum bandwidth of shared memory.","PeriodicalId":298901,"journal":{"name":"International Workshop on Data Management on New Hardware","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on Data Management on New Hardware","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2236584.2236590","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

Abstract

Implementations of database operators on GPU processors have shown dramatic performance improvement compared to multicore-CPU implementations. GPU threads can cooperate using shared memory, which is organized in interleaved banks and is fast only when threads read and modify addresses belonging to distinct memory banks. Therefore, data processing operators implemented on a GPU, in addition to contention caused by popular values, have to deal with a new performance limiting factor: thread serialization when accessing values belonging to the same bank. Here, we define the problem of bank and value conflict optimization for data processing operators using the CUDA platform. To analyze the impact of these two factors on operator performance we use two database operations: foreignkey join and grouped aggregation. We suggest and evaluate techniques for optimizing the data arrangement offline by creating clones of values to reduce overall memory contention. Results indicate that columns used for writes, as grouping columns, need be optimized to fully exploit the maximum bandwidth of shared memory.
改进GPU处理器上OLAP操作符的内存争用
与多核cpu实现相比,在GPU处理器上实现数据库运算符显示出显著的性能改进。GPU线程可以使用共享内存进行合作,共享内存被组织在交错的内存库中,只有当线程读取和修改属于不同内存库的地址时才会快速。因此,在GPU上实现的数据处理操作符,除了流行值引起的争用之外,还必须处理一个新的性能限制因素:访问属于同一银行的值时的线程序列化。在这里,我们定义了使用CUDA平台的数据处理算子的bank和value冲突优化问题。为了分析这两个因素对操作符性能的影响,我们使用两个数据库操作:外键连接和分组聚合。我们建议并评估通过创建值的克隆来离线优化数据安排的技术,以减少总体内存争用。结果表明,需要对用于写的列(如分组列)进行优化,以充分利用共享内存的最大带宽。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信