基于运算集架构的fpga深度神经网络的超前编译

IF 1.4 3区计算机科学 Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Computer Architecture Letters Pub Date : 2022-12-13 DOI:10.1109/LCA.2022.3227643

Burkhard Ringlein;Francois Abel;Dionysios Diamantopoulos;Beat Weiss;Christoph Hagleitner;Dietmar Fey

{"title":"基于运算集架构的fpga深度神经网络的超前编译","authors":"Burkhard Ringlein;Francois Abel;Dionysios Diamantopoulos;Beat Weiss;Christoph Hagleitner;Dietmar Fey","doi":"10.1109/LCA.2022.3227643","DOIUrl":null,"url":null,"abstract":"The slow-down of technology scaling combined with the exponential growth of modern machine learning and artificial intelligence models has created a demand for specialized accelerators, such as GPUs, ASICs, and field-programmable gate arrays (FPGAs). FPGAs can be reconfigured and have the potential to outperform other accelerators, while also being more energy-efficient, but are cumbersome to use with today's fractured landscape of tool flows. We propose the concept of an operation set architecture to overcome the current incompatibilities and hurdles in using DNN-to-FPGA compilers by combining existing specialized frameworks into one organic compiler that also allows the efficient and automatic re-use of existing community tools. Furthermore, we demonstrate that mixing different existing frameworks can increase the efficiency by more than an order of magnitude.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":"22 1","pages":"9-12"},"PeriodicalIF":1.4000,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Advancing Compilation of DNNs for FPGAs Using Operation Set Architectures\",\"authors\":\"Burkhard Ringlein;Francois Abel;Dionysios Diamantopoulos;Beat Weiss;Christoph Hagleitner;Dietmar Fey\",\"doi\":\"10.1109/LCA.2022.3227643\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The slow-down of technology scaling combined with the exponential growth of modern machine learning and artificial intelligence models has created a demand for specialized accelerators, such as GPUs, ASICs, and field-programmable gate arrays (FPGAs). FPGAs can be reconfigured and have the potential to outperform other accelerators, while also being more energy-efficient, but are cumbersome to use with today's fractured landscape of tool flows. We propose the concept of an operation set architecture to overcome the current incompatibilities and hurdles in using DNN-to-FPGA compilers by combining existing specialized frameworks into one organic compiler that also allows the efficient and automatic re-use of existing community tools. Furthermore, we demonstrate that mixing different existing frameworks can increase the efficiency by more than an order of magnitude.\",\"PeriodicalId\":51248,\"journal\":{\"name\":\"IEEE Computer Architecture Letters\",\"volume\":\"22 1\",\"pages\":\"9-12\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2022-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Computer Architecture Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/9984183/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Computer Architecture Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/9984183/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 1

摘要

技术规模的放缓，加上现代机器学习和人工智能模型的指数级增长，产生了对专用加速器的需求，如GPU、ASIC和现场可编程门阵列（FPGA）。FPGA可以重新配置，有可能超越其他加速器，同时也更节能，但在当今工具流支离破碎的环境中使用起来很麻烦。我们提出了操作集架构的概念，通过将现有的专用框架组合到一个有机编译器中，来克服当前使用DNN到FPGA编译器的不兼容性和障碍，该编译器还允许有效和自动地重用现有的社区工具。此外，我们证明，混合不同的现有框架可以将效率提高一个数量级以上。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Advancing Compilation of DNNs for FPGAs Using Operation Set Architectures

The slow-down of technology scaling combined with the exponential growth of modern machine learning and artificial intelligence models has created a demand for specialized accelerators, such as GPUs, ASICs, and field-programmable gate arrays (FPGAs). FPGAs can be reconfigured and have the potential to outperform other accelerators, while also being more energy-efficient, but are cumbersome to use with today's fractured landscape of tool flows. We propose the concept of an operation set architecture to overcome the current incompatibilities and hurdles in using DNN-to-FPGA compilers by combining existing specialized frameworks into one organic compiler that also allows the efficient and automatic re-use of existing community tools. Furthermore, we demonstrate that mixing different existing frameworks can increase the efficiency by more than an order of magnitude.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Computer Architecture Letters COMPUTER SCIENCE, HARDWARE & ARCHITECTURE-

CiteScore

4.60

自引率

4.30%

发文量

期刊介绍： IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.