{"title":"Flexible Hardware Accelerator Design Generation with Spiral","authors":"Guanglin Xu, J. Hoe, F. Franchetti","doi":"10.1109/HPEC55821.2022.9926413","DOIUrl":null,"url":null,"abstract":"Hardware specialization has become a widely employed technique for approaching higher performance and en-ergy efficiency in computer systems. Yet obtaining efficient cus-tom hardware designs remains a challenging and tedious task, calling for the automated approaches. In the past, Spiral has been used for generating high-throughput streaming hardware designs for linear transform kernels. This paper is motivated by an observation that a memory-based iterative computing model may allow us to trade off throughput for algorithmic flexibility. In this paper, we present a hardware generation approach that generates and optimizes algorithms using Spiral's multi-level domain-specific languages (DSLs), targeting a scalar load-store architecture. We have incorporated this approach as a hardware backend into the Spiral system. Our evaluation of this approach on several fundamental kernels shows flexibility with reasonable performance and resource utilization.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"43 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC55821.2022.9926413","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Hardware specialization has become a widely employed technique for approaching higher performance and en-ergy efficiency in computer systems. Yet obtaining efficient cus-tom hardware designs remains a challenging and tedious task, calling for the automated approaches. In the past, Spiral has been used for generating high-throughput streaming hardware designs for linear transform kernels. This paper is motivated by an observation that a memory-based iterative computing model may allow us to trade off throughput for algorithmic flexibility. In this paper, we present a hardware generation approach that generates and optimizes algorithms using Spiral's multi-level domain-specific languages (DSLs), targeting a scalar load-store architecture. We have incorporated this approach as a hardware backend into the Spiral system. Our evaluation of this approach on several fundamental kernels shows flexibility with reasonable performance and resource utilization.