ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming: Latest Publications

Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638471
Xiaoyan Liu, Xuegui Zheng, Hailong Yang, Zhongzhi Luan, Depei Qian
Pages: 229-242
Citations: 0
POSTER: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638481
Jiaao He, Shengqi Chen, Jidong Zhai
Pages: 466-468
Citations: 0
POSTER: FineCo: Fine-grained Heterogeneous Resource Management for Concurrent DNN Inferences
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638485
Lixian Ma, Haoruo Chen, En Shao, Leping Wang, Quan Chen, Guangming Tan
Pages: 451-453
Citations: 0
POSTER: StructMG: A Fast and Scalable Structured Multigrid
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638482
Yi Zong, Xinliang Wang, Haopeng Huang, Chensong Zhang, Xiaowen Xu, Jian Sun, Bowen Yan, Qin Wang, Sicong Li, Zhaohui Ding, Wei Xue
Pages: 478-480
Citations: 0
Towards Scalable Unstructured Mesh Computations on Shared Memory Many-Cores
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638473
Haozhong Qiu, Chuanfu Xu, Jianbin Fang, Liang Deng, Jian Zhang, Qingsong Wang, Yue Ding, Z. Dai, Yonggang Che, Shizhao Chen, Jie Liu
Pages: 109-119
Citations: 0
FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638465
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, R. Wu, Xiwen Zhang, Jian Peng, Yang You
Pages: 417-430
Citations: 0
INFINEL: An efficient GPU-based processing method for unpredictable large output graph queries
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638490
Sungwoo Park, Seyeon Oh, Min-Soo Kim
Pages: 147-159
Citations: 0
ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638476
Yuetao Chen, Kun Li, Yuhao Wang, Donglin Bai, Lei Wang, Lingxiao Ma, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang
Pages: 333-347
Abstract: Tensor Core Unit (TCU) is increasingly integrated into modern high-performance processors to enhance matrix multiplication performance. However, constrained to its over-specification, its potential for improving other critical scientific operations like stencil computations remains untapped. This paper presents ConvStencil, a novel stencil computing system designed to efficiently transform stencil computation to matrix multiplication on Tensor Cores. We first develop a performance model for ConvStencil to guide algorithm design and optimization on TCUs. Based on this model, we propose three techniques: (1) Memory-efficient Layout Transformation using the stencil2row method; (2)
Citations: 0
Recurrence Analysis for Automatic Parallelization of Subscripted Subscripts
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638493
Akshay Bhosale, Rudolf Eigenmann
Pages: 80-93
Citations: 0
POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2024-02-20 DOI: 10.1145/3627535.3638488
Shunde Li, Junyu Gu, Jue Wang, Tiechui Yao, Zhiqiang Liang, Yumeng Shi, Shigang Li, Weiting Xi, Shushen Li, Chunbao Zhou, Yangang Wang, Xuebin Chi
Pages: 469-471
Citations: 0