Banshee: A Fast LLVM-Based RISC-V Binary Translator
Samuel Riedel, Fabian Schuiki, Paul Scheffler, Florian Zaruba, L. Benini
2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), published 2021-11-01
DOI: 10.1109/ICCAD51958.2021.9643546 (https://doi.org/10.1109/ICCAD51958.2021.9643546)
Citations: 3
Abstract
System simulators are essential for the exploration, evaluation, and verification of manycore processors and are vital for writing software and developing programming models in conjunction with architecture design. A promising approach to fast, scalable, and instruction-accurate simulation is binary translation. In this paper, we present Banshee, an instruction-accurate full-system RISC-V multi-core simulator based on LLVM-powered ahead-of-time binary translation that can simulate systems with thousands of cores. Banshee supports the RV32IMAFD instruction set. It also models peripherals, custom ISA extensions, and a multi-level, actively managed memory hierarchy used in existing multi-cluster systems. Banshee is agnostic to the host architecture, fully open-source, and easily extensible to facilitate the exploration and evaluation of new ISA extensions. As a key novelty with respect to existing binary translation approaches, Banshee supports performance estimation through a lightweight extension, modeling the effect of architectural latencies with an average deviation of only 2% from their actual impact. We evaluate Banshee by simulating various compute-intensive workloads on two large-scale open-source RISC-V manycore systems, Manticore and MemPool (with 4096 and 256 cores, respectively). We achieve simulation speeds of up to 618 MIPS per core or 72 GIPS for complete systems, exhibiting almost perfect scaling, competitive single-core performance, and leading multi-core performance. We demonstrate Banshee's extensibility by implementing multiple custom RISC-V ISA extensions.
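
To make the two central ideas of the abstract concrete, the following is a minimal toy sketch, not Banshee's implementation. It assumes a hypothetical three-opcode subset (`addi`, `add`, `mul`), made-up latency values in `LATENCY`, and Python closures standing in for the host code that an LLVM-based translator would emit. Each instruction is translated once before execution (ahead-of-time translation), and a per-register "ready cycle" table gives a lightweight estimate of how result latencies delay dependent instructions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical result latencies in cycles; illustrative values, not from the paper.
LATENCY = {"addi": 1, "add": 1, "mul": 3}

# (opcode, rd, rs1, rs2_or_imm): a simplified, assumed instruction encoding.
Instr = Tuple[str, int, int, int]


@dataclass
class Core:
    regs: List[int] = field(default_factory=lambda: [0] * 32)
    # Cycle at which each register's latest value becomes available.
    ready: List[int] = field(default_factory=lambda: [0] * 32)
    cycle: int = 0
    retired: int = 0


def translate(program: List[Instr]) -> List[Callable[[Core], None]]:
    """Translate every instruction once, before any of them is executed."""
    ops = []
    for op, rd, rs1, x in program:
        if op == "addi":
            srcs = (rs1,)
            compute = lambda r, a=rs1, imm=x: r[a] + imm
        elif op == "add":
            srcs = (rs1, x)
            compute = lambda r, a=rs1, b=x: r[a] + r[b]
        elif op == "mul":
            srcs = (rs1, x)
            compute = lambda r, a=rs1, b=x: r[a] * r[b]
        else:
            raise ValueError(f"unsupported opcode: {op}")

        def run(core, rd=rd, srcs=srcs, lat=LATENCY[op], compute=compute):
            # Issue once the previous instruction has issued and all operands are ready.
            issue = max([core.cycle] + [core.ready[s] for s in srcs])
            value = compute(core.regs) & 0xFFFFFFFF  # 32-bit wraparound
            if rd != 0:  # x0 is hardwired to zero
                core.regs[rd] = value
                core.ready[rd] = issue + lat
            core.cycle = issue + 1
            core.retired += 1

        ops.append(run)
    return ops


def simulate(program: List[Instr]) -> Core:
    """Run the translated program on a fresh core and return its final state."""
    core = Core()
    for op in translate(program):
        op(core)
    return core


DEMO: List[Instr] = [
    ("addi", 1, 0, 6),  # x1 = 6
    ("addi", 2, 0, 7),  # x2 = 7
    ("mul", 3, 1, 2),   # x3 = x1 * x2
    ("add", 4, 3, 1),   # x4 = x3 + x1, waits for the mul result
]

if __name__ == "__main__":
    core = simulate(DEMO)
    print(core.regs[3], core.regs[4], core.cycle, core.retired)
```

In this sketch the functional result is independent of the timing model: the `ready` table only advances the estimated cycle count, so removing it would leave the register values unchanged. That separation is one plausible reading of "lightweight extension"; the paper itself should be consulted for how Banshee actually tracks latencies.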