RISC-V vector processor for acceleration of machine learning algorithms
Nikola Kovačević, Đorđe Mišeljić, Aleksa Stojković
2022 30th Telecommunications Forum (TELFOR), 2022-11-15. DOI: 10.1109/TELFOR56187.2022.9983779
Citation count: 2
Abstract
In this paper we present an RTL implementation of a 32-bit parametrizable vector processor for accelerating algorithms that work in fixed-point arithmetic. The processor uses the latest RISC-V vector extension ISA specification and is deployed and tested on a Zynq SoC using the Avnet ZedBoard. Our microarchitecture exploits the inherent parallelism in algorithms by splitting execution across multiple vector lanes and enabling chaining of vector instructions. To provide the number of read/write ports required for instruction chaining, the vector register bank combines the double-pumping technique with an XOR-based approach. First, the microarchitecture of the system is explained in detail, and the implementation results on the ZedBoard are presented for several processor configurations. We then compare the performance of the implemented design with several modern processor cores.
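The XOR-based approach mentioned in the abstract can be illustrated with a small behavioral model. What follows is a minimal sketch of the general XOR-based multi-write-port memory technique, not the authors' RTL: the class name `XorRegisterBank`, its parameters, and the bank organization are assumptions made for illustration. Double-pumping (clocking the RAM faster than the rest of the design to get extra accesses per cycle) is not modeled here.

```python
class XorRegisterBank:
    """Behavioral model of an XOR-based memory with several write ports.

    Hypothetical illustration: one storage bank per write port. In hardware
    each bank would be an ordinary RAM, replicated as needed for read ports.
    """

    def __init__(self, num_write_ports: int, depth: int, width: int = 32):
        self.mask = (1 << width) - 1
        self.banks = [[0] * depth for _ in range(num_write_ports)]

    def write(self, port: int, addr: int, data: int) -> None:
        # Encode the data by XORing it with the word stored at the same
        # address in every *other* bank, then store it in this port's bank.
        encoded = data & self.mask
        for i, bank in enumerate(self.banks):
            if i != port:
                encoded ^= bank[addr]
        self.banks[port][addr] = encoded

    def read(self, addr: int) -> int:
        # XORing all banks cancels the stale contributions and leaves
        # the most recently written value for this address.
        value = 0
        for bank in self.banks:
            value ^= bank[addr]
        return value


# Usage: two write ports updating the same and different registers.
rf = XorRegisterBank(num_write_ports=2, depth=32)
rf.write(0, 5, 0x1234)
rf.write(1, 7, 0xBEEF)
rf.write(1, 5, 0xCAFE)  # a later write through another port overrides register 5
assert rf.read(5) == 0xCAFE
assert rf.read(7) == 0xBEEF
```

The point of the encoding is that each write port only ever writes its own bank, so every bank can be a simple RAM with a single write port, while a read still recovers the latest value regardless of which port wrote it.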