A Quantitative Evaluation of Contemporary GPU Simulation Methodology

Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems Pub Date : 2018-06-12 DOI:10.1145/3219617.3219658

Akshay Jain, Mahmoud Khairy, Timothy G. Rogers


Citations: 15

Abstract

Contemporary Graphics Processing Units (GPUs) are used to accelerate highly parallel compute workloads. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. This paper performs an in-depth analysis of commonly accepted GPU simulation methodology, examining the effect that both the workload and the choice of instruction set architecture have on the accuracy of a widely used simulation infrastructure, GPGPU-Sim. We analyze numerous aspects of the architecture, validating the simulation results against real hardware. Based on a characterized set of over 1700 GPU kernels, we demonstrate that while the relative accuracy of compute-intensive workloads is high, inaccuracies in modeling the memory system result in much higher error when memory performance is critical. We then perform a case study using a recently proposed GPU architecture modification, demonstrating that the cross-product of workload characteristics and instruction set architecture choice can have an effect on the predicted efficacy of the technique.
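As a rough illustration of what validating a simulator against real hardware can look like, the sketch below computes the mean absolute percentage error between simulated and measured cycle counts, grouped by workload type. The metric choice, the function names, and every number in it are assumptions made up for illustration; they are not taken from the paper or from GPGPU-Sim.

```python
from statistics import mean


def abs_pct_error(sim_cycles, hw_cycles):
    """Absolute percentage error of a simulated cycle count vs. hardware."""
    return abs(sim_cycles - hw_cycles) / hw_cycles * 100.0


def mean_abs_pct_error(pairs):
    """Average the per-kernel error over (sim_cycles, hw_cycles) pairs."""
    return mean(abs_pct_error(sim, hw) for sim, hw in pairs)


# Hypothetical (simulated, measured) cycle counts; NOT data from the paper.
kernels = {
    "compute_bound": [(1050, 1000), (1900, 2000)],
    "memory_bound": [(1500, 1000), (1200, 2000)],
}

for group, pairs in kernels.items():
    print(f"{group}: {mean_abs_pct_error(pairs):.1f}% mean absolute error")
```

Grouping kernels by whether they are compute-bound or memory-bound before averaging makes it visible when error concentrates in one class of workload, which is the kind of split the abstract describes.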

