ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming: Latest Papers (Page 3)

Scaling Up Transactions with Slower Clocks

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638472

P. Ramalhete, Andreia Correia

Citations: 0

POSTER: RELAX: Durable Data Structures with Swift Recovery

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638469

Almog Zur, Nachshon Cohen, Michal Friedman, E. Petrank

Citations: 0

A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638484

Jinchen Xu, Guanghui Song, Bei Zhou, Fei Li, Jiangwei Hao, Jie Zhao

Citations: 0

Practical Hardware Transactional vEB Trees

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638504

Mohammad Khalaji, Trevor Brown, Khuzaima S. Daudjee, V. Aksenov

Citations: 0

Sparsity in Deep Neural Nets (Keynote)

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638568

N. Shavit

Citations: 0

POSTER: Optimizing Sparse Tensor Contraction with Revisiting Hash Table Design

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638500

Guofeng Feng, Weile Jia, Ninghui Sun, Guangming Tan, Jiajia Li

Citations: 0

Training one DeePMD Model in Minutes: a Step towards Online Learning

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638505

Siyu Hu, Tong Zhao, Qiuchen Sha, Enji Li, Xiangyu Meng, Liping Liu, Lin-Wang Wang, Guangming Tan, Weile Jia

Citations: 0

POSTER: Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638495

Zhuoran Ji, Zhaorui Zhang, Jiming Xu, Lei Ju

Citations: 0

Gallatin: A General-Purpose GPU Memory Manager

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638499

Hunter McCoy, Prashant Pandey

{"title":"Gallatin: A General-Purpose GPU Memory Manager","authors":"Hunter McCoy, Prashant Pandey","doi":"10.1145/3627535.3638499","DOIUrl":"https://doi.org/10.1145/3627535.3638499","url":null,"abstract":"Dynamic memory management is critical for efficiently porting modern data processing pipelines to GPUs. However, building a general-purpose dynamic memory manager on GPUs is challenging due to the massive parallelism and weak memory coherence. Existing state-of-the-art GPU memory managers, Ouroboros and Reg-Eff, employ traditional data structures such as arrays and linked lists to manage memory objects. They build specialized pipelines to achieve performance for a fixed set of allocation sizes and fall back to the CUDA allocator for allocating large sizes. In the process, they lose general-purpose usability and fail to support critical applications such as streaming graph processing. In this paper, we introduce Gallatin, a general-purpose and high-performance GPU memory manager. Gallatin uses the van Emde Boas (vEB) tree data structure to manage memory objects efficiently and supports allocations of any size. Furthermore,wedevelopahighly-concurrentGPUimplemen-tationofthevEBtreewhichcanbebroadlyusedinotherGPU applications.Itsupportsconstanttimeinsertions,deletions, andsuccessoroperationsforagivenmemorysize. Inourevaluation,wecompareGallatinwithstate-of-the-artspecializedallocatorvariants.Gallatinisupto374 × faster onsingle-sizedallocationsandupto264 × fasteronmixed-size allocations than the next-best allocator. In scalability benchmarks, Gallatin is up to 254 × times faster than the next-best allocator as the number of threads increases. For the graph benchmarks, Gallatin is 1 . 
5 × faster than the state-of-the-art for bulk insertions, slightly faster for bulk deletions, and is 3 × faster than the next-best allocator for all graph expansion tests.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"329 ","pages":"364-376"},"PeriodicalIF":0.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140448099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

VERLIB: Concurrent Versioned Pointers

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638501

G. Blelloch, Yuanhao Wei

Citations: 0