Matteo Perotti, Matheus A. Cavalcante, Nils Wistoff, Renzo Andri, L. Cavigelli, L. Benini
{"title":"A “New Ara” for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design","authors":"Matteo Perotti, Matheus A. Cavalcante, Nils Wistoff, Renzo Andri, L. Cavigelli, L. Benini","doi":"10.1109/ASAP54787.2022.00017","DOIUrl":null,"url":null,"abstract":"Vector architectures are gaining traction for highly efficient processing of data-parallel workloads, driven by all major ISAs (RISC-V, Arm, Intel), and boosted by landmark chips, like the Arm SVE-based Fujitsu A64FX, powering the TOP500 leader Fugaku. The RISC-V V extension has recently reached 1.0-Frozen status. Here, we present its first open-source implementation, discuss the new specification's impact on the micro-architecture of a lane-based design, and provide insights on performance-oriented design of coupled scalar-vector processors. Our system achieves comparable/better PPA than state-of-the-art vector engines that implement older RVV versions: 15% better area, 6% improved throughput, and FPU utilization >98.5% on crucial kernels.","PeriodicalId":207871,"journal":{"name":"2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASAP54787.2022.00017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
Vector architectures are gaining traction for highly efficient processing of data-parallel workloads, driven by all major ISAs (RISC-V, Arm, Intel), and boosted by landmark chips, like the Arm SVE-based Fujitsu A64FX, powering the TOP500 leader Fugaku. The RISC-V V extension has recently reached 1.0-Frozen status. Here, we present its first open-source implementation, discuss the new specification's impact on the micro-architecture of a lane-based design, and provide insights on performance-oriented design of coupled scalar-vector processors. Our system achieves comparable/better PPA than state-of-the-art vector engines that implement older RVV versions: 15% better area, 6% improved throughput, and FPU utilization >98.5% on crucial kernels.