The Landscape of GPU-Centric Communication

Didem Unat, Ilyas Turimbetov, Mohammed Kefah Taha Issa, Doğan Sağbili, Flavio Vella, Daniele De Sensi, Ismayil Ismayilov
arXiv:2409.09874 · arXiv - CS - Performance · Published 2024-09-15
Citations: 0

Abstract

In recent years, GPUs have become the preferred accelerators for HPC and ML applications due to their parallelism and fast memory bandwidth. While GPUs boost computation, inter-GPU communication can create scalability bottlenecks, especially as the number of GPUs per node and cluster grows. Traditionally, the CPU managed multi-GPU communication, but advancements in GPU-centric communication now challenge this CPU dominance by reducing its involvement, granting GPUs more autonomy in communication tasks, and addressing mismatches in multi-GPU communication and computation. This paper provides a landscape of GPU-centric communication, focusing on vendor mechanisms and user-level library support. It aims to clarify the complexities and diverse options in this field, define the terminology, and categorize existing approaches within and across nodes. The paper discusses vendor-provided mechanisms for communication and memory management in multi-GPU execution and reviews major communication libraries, their benefits, challenges, and performance insights. Then, it explores key research paradigms, future outlooks, and open research questions. By extensively describing GPU-centric communication techniques across the software and hardware stacks, we provide researchers, programmers, engineers, and library designers with insights on how to best exploit multi-GPU systems.
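The contrast the abstract draws between CPU-managed and GPU-centric communication can be illustrated with a minimal CUDA sketch. This is not code from the paper; it assumes two peer-to-peer-capable GPUs in one node and uses only standard CUDA runtime calls (`cudaDeviceEnablePeerAccess`, `cudaMemcpyPeerAsync`). The first transfer is initiated by the host; the second is issued from device code, removing the CPU from the data path:

```cuda
#include <cuda_runtime.h>

// GPU-initiated communication: a kernel running on GPU 0 writes directly
// into GPU 1's memory through a peer-mapped pointer.
__global__ void push_to_peer(const float* src, float* peer_dst, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) peer_dst[i] = src[i];
}

int main() {
    const size_t n = 1 << 20;
    float *buf0, *buf1;

    cudaSetDevice(0);
    cudaMalloc(&buf0, n * sizeof(float));
    cudaDeviceEnablePeerAccess(1, 0);  // map GPU 1's memory into GPU 0's address space

    cudaSetDevice(1);
    cudaMalloc(&buf1, n * sizeof(float));

    cudaSetDevice(0);
    // CPU-managed communication: the host initiates the inter-GPU copy.
    cudaMemcpyPeerAsync(buf1, 1, buf0, 0, n * sizeof(float), 0);

    // GPU-centric communication: the transfer is issued from device code,
    // so it can be fused with computation and needs no host involvement.
    push_to_peer<<<(unsigned)((n + 255) / 256), 256>>>(buf0, buf1, n);
    cudaDeviceSynchronize();

    cudaFree(buf0);
    cudaSetDevice(1);
    cudaFree(buf1);
    return 0;
}
```

Libraries surveyed in the paper (e.g., NCCL, NVSHMEM) build higher-level collectives and one-sided operations on top of mechanisms like these.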