SFU-Driven Transparent Approximation Acceleration on GPUs

Proceedings of the 2016 International Conference on Supercomputing
Pub Date: 2016-06-01
DOI: 10.1145/2925426.2926255

Ang Li, S. Song, M. Wijtvliet, Akash Kumar, H. Corporaal


Citations: 30

Abstract

Approximate computing, a technique that sacrifices a certain amount of accuracy in exchange for a substantial performance boost or power reduction, is one of the most promising solutions for enabling power control and performance scaling towards exascale. Although most existing approximation designs target emerging data-intensive applications that are comparatively more error-tolerant, there is still high demand for accelerating traditional scientific applications (e.g., weather and nuclear simulation), which often make intensive use of transcendental function calls and are very sensitive to accuracy loss. To address this challenge, we focus on a very important but long-ignored approximation unit on today's commercial GPUs, the special-function unit (SFU), and clarify its unique role in accelerating accuracy-sensitive applications in the context of approximate computing. To better understand its features, we conduct a thorough empirical analysis on three generations of NVIDIA GPU architectures to evaluate all the single-precision and double-precision numeric transcendental functions that can be accelerated by SFUs, in terms of their performance, accuracy, and power consumption. Based on the insights from the evaluation, we propose a transparent, tractable, and portable design framework for SFU-driven approximate acceleration on GPUs. Our design is software-based and requires no hardware or application modifications. Experimental results on three NVIDIA GPU platforms demonstrate that our proposed framework can provide fine-grained tuning of performance and accuracy trade-offs, thus helping applications achieve the maximum performance under given accuracy constraints.
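The abstract does not describe how the framework selects an approximation, so the following is only a minimal sketch of the general idea of choosing the cheapest implementation that still meets an accuracy constraint. It is not the authors' method and does not use SFUs: Taylor polynomials for sin stand in for the approximate variants, the number of terms stands in for cost, and the names taylor_sin, max_abs_error, and cheapest_variant are hypothetical.

```python
import math


def taylor_sin(x, terms):
    """Approximate sin(x) with the first `terms` terms of its Taylor series."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        # Next term: multiply the current one by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def max_abs_error(terms, samples=1001):
    """Largest absolute error against math.sin on a uniform grid over [-pi, pi]."""
    worst = 0.0
    for i in range(samples):
        x = -math.pi + 2 * math.pi * i / (samples - 1)
        worst = max(worst, abs(taylor_sin(x, terms) - math.sin(x)))
    return worst


def cheapest_variant(error_bound, max_terms=12):
    """Return the smallest term count whose measured error meets the bound.

    Term count is an assumed proxy for cost; returns None if no variant
    up to `max_terms` satisfies the bound.
    """
    for terms in range(1, max_terms + 1):
        if max_abs_error(terms) <= error_bound:
            return terms
    return None


if __name__ == "__main__":
    for bound in (1e-2, 1e-4, 1e-6):
        print(bound, cheapest_variant(bound))
```

Tightening the error bound pushes the selection towards more expensive variants, which is the kind of performance and accuracy trade-off the abstract refers to.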

