Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation

2019 IEEE International Workshop on Signal Processing Systems (SiPS). Pub Date: 2019-10-01. DOI: 10.1109/SiPS47522.2019.9020385

Luca Ferranti, J. Boutellier


Citations: 0

Abstract


Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation

Graphics Processing Units (GPUs) are widely used in many fields of scientific computing, including signal processing. GPUs have a hierarchical memory structure whose layers are shared between GPU processing elements. Partly because of this complex memory hierarchy, GPU programming is non-trivial, and several aspects must be taken into account, one of them being memory access patterns. Shared memory, one of the fastest GPU memory layers, is divided into banks so that processing elements can access it quickly and in parallel. Unfortunately, multiple threads of a GPU program may access the same shared memory bank simultaneously, causing a bank conflict. When this happens, program execution slows down because the memory accesses have to be rescheduled to determine which instruction to execute first. The compiler does not account for bank conflicts automatically, so the programmer must detect and deal with them before program execution. In this paper, we present an algebraic approach to detecting bank conflicts and prove theoretical results that can be used to predict when bank conflicts happen and how to avoid them. Our experimental results illustrate the resulting savings in computation time.
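To make the notion of a bank conflict concrete, the following is a minimal sketch and not the paper's algebraic method. It assumes the commonly documented layout of 32 banks of 4-byte words, where word index `w` lives in bank `w mod 32`, and it assumes that threads reading the same word are served by a broadcast. The names `bank_of`, `conflict_degree`, and `strided_access` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter
from math import gcd

NUM_BANKS = 32  # assumption: 32 banks of 4-byte words


def bank_of(word_index, num_banks=NUM_BANKS):
    """Bank that holds the given 4-byte word (assumed modulo mapping)."""
    return word_index % num_banks


def conflict_degree(word_indices, num_banks=NUM_BANKS):
    """Largest number of distinct words that fall into one bank.

    1 means conflict-free; k means the access is split into k passes
    under the assumed model.
    """
    distinct_words = set(word_indices)  # same word: assumed broadcast
    per_bank = Counter(bank_of(w, num_banks) for w in distinct_words)
    return max(per_bank.values())


def strided_access(stride, num_threads=32):
    """Word indices touched when thread t reads element t * stride."""
    return [t * stride for t in range(num_threads)]


if __name__ == "__main__":
    for stride in (1, 2, 3, 32, 33):
        degree = conflict_degree(strided_access(stride))
        print(f"stride {stride:2d}: conflict degree {degree}, "
              f"gcd(stride, 32) = {gcd(stride, NUM_BANKS)}")
```

Under these assumptions, a positive stride `s` across 32 threads yields a conflict degree of `gcd(s, 32)`. This is why padding a 32-word row to 33 words (stride 33) removes the conflict that a stride of 32 produces.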

