Improving Performance of Error-Tolerant Applications: A Case Study of Approximations on an Off-the-Shelf Neural Accelerator

Tomas Gonzalez-Aragon, Jorge Castro-Godínez
DOI: 10.1109/jocici54528.2021.9794353
2021 IEEE V Jornadas Costarricenses de Investigación en Computación e Informática (JoCICI), published 2021-10-25
Trending workloads and applications are driving many of the new advances in computer architecture and design paradigms. For instance, deep learning applications are transforming the way we do computing. On one hand, specialized hardware accelerators for these applications are now commercialized as neural processing units, achieving significant performance improvements. On the other hand, design paradigms such as approximate computing exploit the inherent tolerance of these applications to imprecise computations, reducing their computational complexity and yielding energy-efficient implementations. When an off-the-shelf specialized accelerator is paired with an edge computing platform, however, the applicable approximations are limited to the software layer. In this work, we present a case study of performance improvement obtained by introducing approximate computing techniques into three deep learning classification applications. Our test platform consists of a Raspberry Pi 4 as the edge computing device and a Movidius Myriad X as the neural accelerator. Our experimental results show that a mixture of approximation techniques achieves performance improvements of 20x to 48x, with no accuracy degradation, for a compute-intensive classification application.
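The abstract does not name the specific software-layer approximations used; as illustration only, the sketch below shows loop perforation, one classic technique in this family: skip a fraction of loop iterations in an error-tolerant computation, trading a small accuracy loss for proportionally less work. The function names and the stride parameter are hypothetical, not taken from the paper.

```python
# Illustrative loop perforation: a software-layer approximation that
# processes only every `stride`-th element. This is NOT necessarily one
# of the techniques combined in the paper; it is a generic example.

def mean_exact(xs):
    # Baseline: examine every element.
    return sum(xs) / len(xs)

def mean_perforated(xs, stride=4):
    # Perforated version: sample 1/stride of the elements,
    # doing roughly 1/stride of the work.
    sampled = xs[::stride]
    return sum(sampled) / len(sampled)

if __name__ == "__main__":
    data = list(range(1000))
    exact = mean_exact(data)            # 499.5
    approx = mean_perforated(data)      # 498.0
    rel_err = abs(exact - approx) / exact
    print(f"exact={exact}, approx={approx}, rel_err={rel_err:.3%}")
```

For error-tolerant workloads such as classification, a small numerical deviation like this often leaves the final predicted label unchanged, which is the property approximate computing exploits.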