On Memory Optimal Code Generation for Exposed Datapath Architectures with Buffered Processing Units

Markus Anders, Anoop Bhagyanath, K. Schneider
2018 18th International Conference on Application of Concurrency to System Design (ACSD), published 2018-06-01
DOI: 10.1109/ACSD.2018.00020 (https://doi.org/10.1109/ACSD.2018.00020)
Citations: 5

Abstract

One reason conventional processors make only limited use of instruction-level parallelism (ILP) is their reliance on registers. Some recent processor architectures therefore expose their datapaths to the compiler, so that the compiler can move values directly between processing units. In particular, the Synchronous Control Asynchronous Dataflow (SCAD) machine is an exposed datapath architecture that uses FIFO buffers at the input and output ports of its processing units. Code generation techniques inspired by classic queue machines can completely eliminate the use of conventional registers in SCAD. However, bounded buffer sizes may still make spill code necessary to store values temporarily in main memory. Since memory access is expensive, it should be avoided wherever possible to reduce the execution time of programs. Memory-optimal code generation problems have been studied extensively for register machines and were proven to be NP-complete. In this paper, we prove that memory-optimal code generation for SCAD is also NP-complete by presenting a polynomial-time transformation from memory-optimal register code to memory-optimal SCAD code. In particular, we present a one-to-one correspondence between the registers of register machines and the buffer entries of SCAD machines, which indicates that these architectures are closer to each other than expected. Still, SCAD machines offer important advantages. First, circuit implementations of buffers scale much better in size than register files, so that more space is available on SCAD machines with the same chip size. Second, the instruction set of SCAD does not depend on a fixed number of registers or buffers. We therefore present experimental results that compare the execution time of memory-optimal SCAD code using FIFO buffers with that of memory-optimal code based on conventional register allocation.
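To make the spill problem concrete, the following Python sketch models a single bounded FIFO buffer that overflows into main memory. It is a toy illustration under simplifying assumptions, not the SCAD machine or the code generator described in the paper: the names `BoundedFifo` and `count_memory_accesses` are hypothetical, and the spill policy (store on overflow, reload the oldest spilled value after each pop) is just one simple choice that preserves FIFO order.

```python
from collections import deque


class BoundedFifo:
    """Toy bounded FIFO buffer that spills overflow values to 'main memory'."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.buffer = deque()   # on-chip buffer entries (oldest values)
        self.memory = deque()   # spilled values, kept in arrival order
        self.stores = 0         # number of spill stores to memory
        self.loads = 0          # number of reloads from memory

    def push(self, value):
        # Invariant: whenever memory is non-empty, the buffer is full,
        # so appending to memory here never reorders values.
        if len(self.buffer) < self.capacity:
            self.buffer.append(value)
        else:
            self.memory.append(value)
            self.stores += 1

    def pop(self):
        value = self.buffer.popleft()
        # Refill the freed entry with the oldest spilled value, if any.
        if self.memory:
            self.buffer.append(self.memory.popleft())
            self.loads += 1
        return value


def count_memory_accesses(values, capacity):
    """Push all values, pop them all, and return (output, memory accesses)."""
    fifo = BoundedFifo(capacity)
    for v in values:
        fifo.push(v)
    out = [fifo.pop() for _ in values]
    return out, fifo.stores + fifo.loads
```

With five values and a capacity of 2, three values are spilled and later reloaded, giving six memory accesses; with a capacity of 5, no memory access occurs. Minimizing this count over all valid instruction orders is what makes the code generation problem hard.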