2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3): Latest Papers

Compressed In-memory Graphs for Accelerating GPU-based Analytics
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00011
Noushin Azami, Martin Burtscher
Abstract: Processing large graphs has become an important irregular workload. We present Massively Parallel Log Graphs (MPLG) to accelerate GPU graph codes, including highly optimized codes. MPLG combines a compressed in-memory representation with low-overhead parallel decompression. This yields a speedup if the boost in memory performance due to the reduced footprint outweighs the overhead of the extra instructions to decompress the graph on the fly. However, achieving a sufficiently low overhead is difficult, especially on GPUs with their high-bandwidth memory. Prior work has only successfully employed similar ideas on CPUs, but those approaches exhibit limited parallelism, making them unsuitable for GPUs. On large real-world inputs, MPLG speeds up graph analytics by up to 67% on a Titan V GPU. Averaged over 15 graphs from several domains, it improves the performance of Rodinia's breadth-first search by 11.9%, Gardenia's connected components by 5.8%, and ECL's graph coloring by 5.0%.
Citations: 1
Message from the IA3 22 Workshop Chairs
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/ia356718.2022.00004
Citations: 0
Blocking Sparse Matrices to Leverage Dense-Specific Multiplication
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00009
P. S. Labini, M. Bernaschi, W. Nutt, Francesco Silvestri, Flavio Vella
Abstract: Research to accelerate matrix multiplication, pushed by the growing computational demands of deep learning, has sprouted many efficient architectural solutions, such as NVIDIA's Tensor Cores. These accelerators are designed to process efficiently a high volume of small dense matrix products in parallel. However, it is not obvious how to leverage these accelerators for sparse matrix multiplication. A natural way to adapt the accelerators to this problem is to divide the matrix into small blocks, and then multiply only the nonzero blocks. In this paper, we investigate ways to reorder the rows of a sparse matrix to reduce the number of nonzero blocks and cluster the nonzero elements into a few dense blocks. While this pre-processing can be computationally expensive, we show that the high speed-up provided by the accelerators can easily repay the cost, especially when several multiplications follow one reordering.
Citations: 0
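As a rough illustration of the blocking idea in the abstract above, the following Python sketch counts the nonzero blocks of a small sparse matrix before and after a row permutation. The matrix, the 2x2 block size, the helper `count_nonzero_blocks`, and the hand-picked permutation are all assumptions made for illustration; the paper's actual reordering algorithm is not reproduced here.

```python
def count_nonzero_blocks(rows, block_h, block_w):
    """Count blocks of size block_h x block_w that hold at least one nonzero.

    `rows` is a list of sets; rows[i] holds the column indices of the
    nonzero entries in row i.
    """
    blocks = set()
    for i, cols in enumerate(rows):
        for j in cols:
            blocks.add((i // block_h, j // block_w))
    return len(blocks)


# Hypothetical 4x4 matrix: rows 0 and 2 share a column pattern, as do rows 1 and 3.
rows = [{0, 1}, {2, 3}, {0, 1}, {2, 3}]

# Hand-picked permutation that places rows with the same pattern side by side.
perm = [0, 2, 1, 3]
reordered = [rows[i] for i in perm]

before = count_nonzero_blocks(rows, 2, 2)
after = count_nonzero_blocks(reordered, 2, 2)
print(before, after)
```

In this toy case the reordering halves the number of blocks that would have to be multiplied, and each remaining block is fully dense.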
IA3 2022 Workshop Organization
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/ia356718.2022.00005
Citations: 0
Page-Address Coalescing of Vector Gather Instructions for Efficient Address Translation
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00007
Hikaru Takayashiki, Masayuki Sato, K. Komatsu, Hiroaki Kobayashi
Abstract: Vector gather instructions are available in various processors, which are essential for handling irregular memory accesses. Additionally, the processors support virtual memory that allows programmers not to consider the limitation of the physical memory space. To realize the virtual memory, the processors require address translation between virtual and physical addresses. When a vector gather instruction loads data elements distributed over the physical memory space, all virtual addresses must be translated one by one, causing many translations by accessing a Translation Lookaside Buffer (TLB). Hence, the TLB easily becomes a bottleneck in handling vector gather instructions. To relieve the bottleneck, this paper proposes an address coalescing method for the address translations of vector gather instructions by utilizing vector arithmetic units in the processor. The evaluation results show that the proposed method can achieve a 2x performance improvement in numerical and 1.08x in graph applications, which contain many vector gather instructions.
Citations: 0
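The intuition behind coalescing the translations of a gather can be sketched in software: addresses that fall on the same page need only one translation. The Python model below is an assumption-laden illustration, not the paper's hardware mechanism; the 4 KiB page size, the function name `coalesce_pages`, and the sample addresses are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def coalesce_pages(addresses, page_size=PAGE_SIZE):
    """Return the distinct virtual page numbers touched by a gather,
    in the order they are first seen."""
    seen = set()
    pages = []
    for addr in addresses:
        vpn = addr // page_size  # virtual page number of this element
        if vpn not in seen:
            seen.add(vpn)
            pages.append(vpn)
    return pages


# Hypothetical element addresses of one vector gather.
gather_addresses = [0x1000, 0x1008, 0x2010, 0x1FF8, 0x2000, 0x5000]
pages = coalesce_pages(gather_addresses)

# One translation per distinct page instead of one per element.
print(len(gather_addresses), len(pages))
```

Here six element addresses collapse to three page translations; how much is saved in practice depends on how clustered the gathered addresses are.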
SparseLU, A Novel Algorithm and Math Library for Sparse LU Factorization
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00010
Pedro Valero-Lara, Cameron Greenwalt, J. Vetter
Abstract: Decomposing sparse matrices into lower and upper triangular matrices (sparse LU factorization) is a key operation in many computational scientific applications. We developed SparseLU, a sparse linear algebra library that implements a new algorithm for LU factorization on general sparse matrices. The new algorithm divides the input matrix into tiles to which OpenMP tasks are created for factorization computation, where only tiles that contain nonzero elements are computed. For comparative performance analysis, we used the reference library SuperLU. Testing was performed on synthetically generated matrices which replicate the conditions of the real-world matrices. SparseLU is able to reach a mean speedup of ~29× compared to SuperLU.
Citations: 0
Accelerating Datalog applications with cuDF
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00012
Ahmedur Rahman Shovon, Landon Dyken, Oded Green, Thomas Gilray, Sidharth Kumar
Abstract: Datalog, a bottom-up declarative logic programming language, has a wide variety of uses for deduction, modeling, and data analysis, across application domains. Datalog can be efficiently implemented using relational algebra primitives such as join, projection and union. While there exist several multi-threaded and multi-core implementations of Datalog, targeting CPU-based systems, our work makes an inroad towards developing a Datalog implementation for GPUs. We demonstrate the feasibility of a high-performance relational algebra backend for a subset of Datalog applications that can effectively leverage the parallelism of GPUs using cuDF. cuDF is a library from the Rapids suite that uses the NVIDIA CUDA programming model for GPU parallelism. It provides similar functionalities to Pandas, a popular data analysis engine. In this paper, we analyze and evaluate the performance of cuDF versus Pandas for two graph-mining problems implemented in Datalog, (1) triangle counting and (2) transitive-closure computation.
Citations: 2
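To make the mapping from Datalog to join, projection, and union concrete, here is a minimal Python sketch of transitive closure. It uses plain Python sets as a stand-in for dataframe operations, so it is neither the authors' code nor the cuDF or Pandas API; the function name `transitive_closure`, the semi-naive iteration scheme, and the sample edge list are assumptions chosen for illustration.

```python
def transitive_closure(edges):
    """Compute path(x, y) from edge(x, y) by iterating join, projection, union.

    Datalog rules being modeled:
        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    """
    edges = set(edges)
    path = set(edges)
    delta = set(edges)  # tuples discovered in the previous iteration
    while delta:
        # Join delta(x, y) with edge(y, z) on y, then project onto (x, z).
        joined = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = joined - path  # keep only tuples not seen before
        path |= delta          # union the new tuples into the result
    return path


# Hypothetical input: a simple chain 1 -> 2 -> 3 -> 4.
edges = [(1, 2), (2, 3), (3, 4)]
tc = transitive_closure(edges)
print(sorted(tc))
```

The loop stops when an iteration produces no new tuples, which is the fixed point of the two rules. In a dataframe setting, the join would typically be a merge on the shared column and the set difference a deduplication step.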
The Evolution of a New Model of Computation
2022 IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3) Pub Date: 2022-11-01 DOI: 10.1109/IA356718.2022.00008
Brian A. Page, P. Kogge
Abstract: The conventional model of parallel programming today involves either copying data across cores (and then having to track its most recent value), or not copying and requiring deep software stacks to perform even the simplest operation on data that is “remote”, i.e., out of the range of loads and stores from the current core. As application requirements grow to larger data sets, with more irregular access to them, both conventional approaches start to exhibit severe scaling limitations. This paper reviews some growing evidence of the potential value of a new model of computation that skirts between the two: data does not move (i.e., is not copied), but computation instead moves to the data. Several different applications involving large sparse computations, streaming of data, and complex mixed mode operations have been coded for a novel platform where thread movement is handled invisibly by the hardware. The evidence to date indicates that parallel scaling for this paradigm can be significantly better than any mix of conventional models.
Citations: 0