基于CUDA的大规模SIMD计算中多核和GPU处理单元的功耗效率研究

D. Ren, R. Suda
{"title":"基于CUDA的大规模SIMD计算中多核和GPU处理单元的功耗效率研究","authors":"D. Ren, R. Suda","doi":"10.1109/GREENCOMP.2010.5598300","DOIUrl":null,"url":null,"abstract":"CPU-GPU Processing Element (PE) has become a very popular architecture to construct modern multiprocessing system because of its high performance on massively parallel processing and vector computations. Power dissipation is one of the important factors influencing design development of High Performance Computing (HPC) as a large scale scientific computation may use thousands of processors and hundreds hours of continuous execution that will result enormous energy predicament. Enhancing the utilizations of an individual PE to reach its best computation capability and power efficiency is valuable for saving the overall power cost of large multi-processing systems. Power performance of a CUDA PE is dependent on electrical features of the inside hardware components and their interconnections; also high level applications and the parallel algorithms performed on it. Based on measurements and experimental evaluations, in this work we provide a load sharing method to adjust the workload assignment within the CPU and GPU components inside a CUDA PE in order to optimize the overall power efficiency. The improvement on computation time and power consumption has been validated by examining the program executions when above method is applied on real systems.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Investigation on the power efficiency of multi-core and GPU Processing Element in large scale SIMD computation with CUDA\",\"authors\":\"D. Ren, R. Suda\",\"doi\":\"10.1109/GREENCOMP.2010.5598300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"CPU-GPU Processing Element (PE) has become a very popular architecture to construct modern multiprocessing system because of its high performance on massively parallel processing and vector computations. Power dissipation is one of the important factors influencing design development of High Performance Computing (HPC) as a large scale scientific computation may use thousands of processors and hundreds hours of continuous execution that will result enormous energy predicament. Enhancing the utilizations of an individual PE to reach its best computation capability and power efficiency is valuable for saving the overall power cost of large multi-processing systems. Power performance of a CUDA PE is dependent on electrical features of the inside hardware components and their interconnections; also high level applications and the parallel algorithms performed on it. Based on measurements and experimental evaluations, in this work we provide a load sharing method to adjust the workload assignment within the CPU and GPU components inside a CUDA PE in order to optimize the overall power efficiency. The improvement on computation time and power consumption has been validated by examining the program executions when above method is applied on real systems.\",\"PeriodicalId\":262148,\"journal\":{\"name\":\"International Conference on Green Computing\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Green Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/GREENCOMP.2010.5598300\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Green Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GREENCOMP.2010.5598300","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22

摘要

CPU-GPU处理单元(PE)由于其在大规模并行处理和矢量计算方面的高性能而成为构建现代多处理系统的一种非常流行的架构。大规模的科学计算可能使用数千个处理器,连续执行数百小时,造成巨大的能量困境,功耗是影响高性能计算设计发展的重要因素之一。提高单个PE的利用率以达到其最佳计算能力和功率效率对于节省大型多处理系统的总体功率成本是有价值的。CUDA PE的电源性能取决于内部硬件组件及其互连的电气特性;还有高级应用程序和并行算法在其上执行。在测量和实验评估的基础上,我们提供了一种负载共享方法来调整CUDA PE内部CPU和GPU组件的工作负载分配,以优化整体功耗效率。该方法在实际系统上的应用验证了其在计算时间和功耗方面的改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Investigation on the power efficiency of multi-core and GPU Processing Element in large scale SIMD computation with CUDA
CPU-GPU Processing Element (PE) has become a very popular architecture to construct modern multiprocessing system because of its high performance on massively parallel processing and vector computations. Power dissipation is one of the important factors influencing design development of High Performance Computing (HPC) as a large scale scientific computation may use thousands of processors and hundreds hours of continuous execution that will result enormous energy predicament. Enhancing the utilizations of an individual PE to reach its best computation capability and power efficiency is valuable for saving the overall power cost of large multi-processing systems. Power performance of a CUDA PE is dependent on electrical features of the inside hardware components and their interconnections; also high level applications and the parallel algorithms performed on it. Based on measurements and experimental evaluations, in this work we provide a load sharing method to adjust the workload assignment within the CPU and GPU components inside a CUDA PE in order to optimize the overall power efficiency. The improvement on computation time and power consumption has been validated by examining the program executions when above method is applied on real systems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信