Accelerating graph applications on integrated GPU platforms via instrumentation-driven optimizations

Proceedings of the ACM International Conference on Computing Frontiers. Pub Date: 2016-05-16. DOI: 10.1145/2903150.2903152

N. Farooqui, Indrajit Roy, Yuan Chen, V. Talwar, K. Schwan


Citations: 3

Abstract

Integrated GPU platforms are a cost-effective and energy-efficient option for accelerating data-intensive applications. While these platforms reduce the overhead of offloading computation to the GPU and offer the potential for fine-grained resource scheduling, several open challenges remain. First, substantial application knowledge is required to leverage GPU acceleration capabilities. Second, static application profiling is inadequate for extracting performance from graph applications that exhibit input-dependent, irregular runtime behaviors. Third, naive scheduling of applications on both CPU and GPU devices may degrade performance due to memory contention. We describe Luminar, a runtime, profile-guided approach to accelerating applications on integrated GPU platforms. By using efficient dynamic instrumentation, Luminar informs resource scheduling about current workload properties. Luminar yields improvements of up to 40% for irregular, graph-based applications, plus 21-80% improvements in throughput and 3-60% improvements in energy efficiency when scheduling a mix of applications.
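To make the idea of profile-informed scheduling concrete, the following is a minimal sketch and not Luminar's actual algorithm, which the abstract does not specify. It assumes two hypothetical runtime counters per kernel (`active_lane_fraction` and `bytes_per_op`) and two hypothetical thresholds. It illustrates the second and third challenges: irregular kernels that use GPU lanes poorly are kept on the CPU, and two memory-bound kernels are not run side by side on the shared memory system.

```python
from dataclasses import dataclass


@dataclass
class KernelProfile:
    """Hypothetical counters collected at runtime by instrumentation."""
    active_lane_fraction: float  # assumed: mean share of SIMD lanes doing useful work (0..1)
    bytes_per_op: float          # assumed: memory traffic per arithmetic operation


# Hypothetical thresholds; a real system would calibrate these per platform.
MIN_LANE_FRACTION = 0.5
MEMORY_BOUND_BYTES_PER_OP = 4.0


def choose_device(profile: KernelProfile) -> str:
    """Return 'gpu' when the observed behavior is regular enough, else 'cpu'."""
    if profile.active_lane_fraction < MIN_LANE_FRACTION:
        return "cpu"  # irregular, divergent work wastes GPU lanes
    return "gpu"


def is_memory_bound(profile: KernelProfile) -> bool:
    """Classify a kernel as memory-bound from its traffic per operation."""
    return profile.bytes_per_op >= MEMORY_BOUND_BYTES_PER_OP


def can_coschedule(a: KernelProfile, b: KernelProfile) -> bool:
    """Avoid running two memory-bound kernels concurrently on CPU and GPU,
    since both devices share one memory system on an integrated platform."""
    return not (is_memory_bound(a) and is_memory_bound(b))
```

Because the counters are sampled while the application runs, the same kernel could be placed differently for different inputs, which is the property that static profiling cannot provide.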
