Paraprox:数据并行应用的基于模式的近似

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems Pub Date : 2014-02-24 DOI:10.1145/2541940.2541948

M. Samadi, D. Jamshidi, Janghaeng Lee, S. Mahlke

{"title":"Paraprox:数据并行应用的基于模式的近似","authors":"M. Samadi, D. Jamshidi, Janghaeng Lee, S. Mahlke","doi":"10.1145/2541940.2541948","DOIUrl":null,"url":null,"abstract":"Approximate computing is an approach where reduced accuracy of results is traded off for increased speed, throughput, or both. Loss of accuracy is not permissible in all computing domains, but there are a growing number of data-intensive domains where the output of programs need not be perfectly correct to provide useful results or even noticeable differences to the end user. These soft domains include multimedia processing, machine learning, and data mining/analysis. An important challenge with approximate computing is transparency to insulate both software and hardware developers from the time, cost, and difficulty of using approximation. This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems. Paraprox starts with a data-parallel kernel implemented using OpenCL or CUDA and creates a parameterized approximate kernel that is tuned at runtime to maximize performance subject to a target output quality (TOQ) that is supplied by the user. Approximate kernels are created by recognizing common computation idioms found in data-parallel programs (e.g., Map, Scatter/Gather, Reduction, Scan, Stencil, and Partition) and substituting approximate implementations in their place. Across a set of 13 soft data-parallel applications with at most 10% quality degradation, Paraprox yields an average performance gain of 2.7x on a NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor compared to accurate execution on each platform.","PeriodicalId":128805,"journal":{"name":"Proceedings of the 19th international conference on Architectural support for programming languages and operating systems","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"215","resultStr":"{\"title\":\"Paraprox: pattern-based approximation for data parallel applications\",\"authors\":\"M. Samadi, D. Jamshidi, Janghaeng Lee, S. Mahlke\",\"doi\":\"10.1145/2541940.2541948\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Approximate computing is an approach where reduced accuracy of results is traded off for increased speed, throughput, or both. Loss of accuracy is not permissible in all computing domains, but there are a growing number of data-intensive domains where the output of programs need not be perfectly correct to provide useful results or even noticeable differences to the end user. These soft domains include multimedia processing, machine learning, and data mining/analysis. An important challenge with approximate computing is transparency to insulate both software and hardware developers from the time, cost, and difficulty of using approximation. This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems. Paraprox starts with a data-parallel kernel implemented using OpenCL or CUDA and creates a parameterized approximate kernel that is tuned at runtime to maximize performance subject to a target output quality (TOQ) that is supplied by the user. Approximate kernels are created by recognizing common computation idioms found in data-parallel programs (e.g., Map, Scatter/Gather, Reduction, Scan, Stencil, and Partition) and substituting approximate implementations in their place. Across a set of 13 soft data-parallel applications with at most 10% quality degradation, Paraprox yields an average performance gain of 2.7x on a NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor compared to accurate execution on each platform.\",\"PeriodicalId\":128805,\"journal\":{\"name\":\"Proceedings of the 19th international conference on Architectural support for programming languages and operating systems\",\"volume\":\"52 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-02-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"215\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 19th international conference on Architectural support for programming languages and operating systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2541940.2541948\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th international conference on Architectural support for programming languages and operating systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2541940.2541948","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 215

摘要

近似计算是一种以降低结果准确性为代价来提高速度、吞吐量或两者兼而有之的方法。不是在所有计算领域都允许失去准确性，但是在越来越多的数据密集型领域中，程序的输出不需要完全正确才能提供有用的结果，甚至不需要为最终用户提供明显的差异。这些软领域包括多媒体处理、机器学习和数据挖掘/分析。近似计算的一个重要挑战是将软件和硬件开发人员从使用近似的时间、成本和难度中隔离出来。本文提出了一个纯软件系统Paraprox，用于实现在商用硬件系统上运行的数据并行程序的透明逼近。Paraprox从使用OpenCL或CUDA实现的数据并行内核开始，并创建一个参数化的近似内核，该内核在运行时进行调整，以根据用户提供的目标输出质量(TOQ)最大化性能。近似核是通过识别数据并行程序中常见的计算习惯(例如，Map、Scatter/Gather、Reduction、Scan、Stencil和Partition)并在其位置替换近似实现来创建的。在一组13个软数据并行应用程序中，Paraprox在NVIDIA GTX 560 GPU上的平均性能提高了2.7倍，在英特尔酷睿i7四核处理器上的平均性能提高了2.5倍，而在每个平台上的精确执行都是如此。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Paraprox: pattern-based approximation for data parallel applications

Approximate computing is an approach where reduced accuracy of results is traded off for increased speed, throughput, or both. Loss of accuracy is not permissible in all computing domains, but there are a growing number of data-intensive domains where the output of programs need not be perfectly correct to provide useful results or even noticeable differences to the end user. These soft domains include multimedia processing, machine learning, and data mining/analysis. An important challenge with approximate computing is transparency to insulate both software and hardware developers from the time, cost, and difficulty of using approximation. This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems. Paraprox starts with a data-parallel kernel implemented using OpenCL or CUDA and creates a parameterized approximate kernel that is tuned at runtime to maximize performance subject to a target output quality (TOQ) that is supplied by the user. Approximate kernels are created by recognizing common computation idioms found in data-parallel programs (e.g., Map, Scatter/Gather, Reduction, Scan, Stencil, and Partition) and substituting approximate implementations in their place. Across a set of 13 soft data-parallel applications with at most 10% quality degradation, Paraprox yields an average performance gain of 2.7x on a NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor compared to accurate execution on each platform.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

自引率

0.00%

发文量