Buffer Allocation for Exposed Datapath Architectures
Authors: Anoop Bhagyanath, K. Schneider
Published in: 2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2022-12-01
DOI: https://doi.org/10.1109/MCSoC57363.2022.00013
Citations: 3
Abstract
Concurrent access to a given number of registers limits the instruction-level parallelism (ILP) used by conventional processors despite the use of many processing units (PUs). Many recent architectures expose their internal datapaths to compilers, allowing the compiler to move intermediate values from program execution directly between PUs, thus bypassing the use of registers. Buffered exposed datapath (BED) architectures additionally implement these inter-PU communication paths with scalable first-in-first-out (FIFO) buffers to avoid the use of registers and to prevent unnecessary synchronization between PUs. However, the BED compiler must ensure that the creation order of intermediate values in a buffer matches their consumption order so that the next executing instructions always find their operands at the heads of the corresponding buffers. In this paper, we present a novel buffer interference analysis that determines a criterion for allocating multiple program variables to the same buffer based on a given instruction schedule that specifies an access order for those variables. We then use the well-known dataflow analysis framework to compute a buffer interference graph whose coloring yields a valid buffer allocation for programs by considering the instructions in the given order. Preliminary experimental results show the effectiveness of our code generation approach compared to traditional register-based compilation. More importantly, the buffer interference graph should serve as the basis for future buffer allocation schemes that maximize ILP usage.
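
The FIFO ordering constraint and the graph-coloring step described in the abstract can be illustrated with a small sketch. This is not the authors' interference analysis: it is a toy model that assumes straight-line code in which every variable is produced exactly once and consumed exactly once, and the names `fifo_compatible`, `interference_graph`, and `greedy_coloring` are hypothetical. Under that assumption, two variables may share a buffer only if each consumer finds its operand at the buffer head, and a proper coloring of the resulting graph assigns one buffer per color.

```python
from collections import deque
from itertools import combinations


def fifo_compatible(events, group):
    """Simulate one FIFO buffer shared by the variables in `group`.

    `events` is the schedule as a list of ("def", var) / ("use", var) pairs.
    Returns True if every use finds its variable at the head of the buffer.
    Assumption (not from the paper): one def and one use per variable.
    """
    queue = deque()
    for kind, var in events:
        if var not in group:
            continue
        if kind == "def":
            queue.append(var)          # producer enqueues at the tail
        elif not queue or queue[0] != var:
            return False               # consumer would read the wrong value
        else:
            queue.popleft()            # consumer dequeues from the head
    return True


def interference_graph(events):
    """Connect two variables if they cannot share a single FIFO buffer."""
    variables = sorted({var for _, var in events})
    graph = {var: set() for var in variables}
    for a, b in combinations(variables, 2):
        if not fifo_compatible(events, {a, b}):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_coloring(graph):
    """Give each variable the smallest buffer index unused by its neighbours."""
    color = {}
    for var in graph:
        taken = {color[n] for n in graph[var] if n in color}
        index = 0
        while index in taken:
            index += 1
        color[var] = index
    return color


# Hypothetical schedule: c is created after b but consumed before it,
# so b and c cannot share a FIFO buffer.
events = [
    ("def", "a"), ("def", "b"), ("use", "a"),
    ("def", "c"), ("use", "c"), ("use", "b"),
]
graph = interference_graph(events)
allocation = greedy_coloring(graph)
print(allocation)
```

In this example `a` and `b` end up in the same buffer because they are consumed in the order they were created, while `c` needs a buffer of its own. A greedy coloring is used only for brevity; it does not necessarily minimize the number of buffers.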