Micro-architecture Pipelining Optimization with Throughput-Aware Floorplanning

Yuchun Ma, Zhuoyuan Li, J. Cong, Xianlong Hong, Glenn D. Reinman, Sheqin Dong, Qiang Zhou
{"title":"Micro-architecture Pipelining Optimization with Throughput-Aware Floorplanning","authors":"Yuchun Ma, Zhuoyuan Li, J. Cong, Xianlong Hong, Glenn D. Reinman, Sheqin Dong, Qiang Zhou","doi":"10.1109/ASPDAC.2007.358107","DOIUrl":null,"url":null,"abstract":"For modern processor designs in nanometer technologies, both block and interconnect pipelining are needed to achieve multi-gigahertz clock frequency, but previous approaches consider block pipelining and interconnect pipelining separately. For example, all recent works on wire pipelining assume pre-pipelined components and consider only inserting pipeline stages on point-to-point wire or bus connections. To the best of our knowledge, this paper is the first that considers block pipelining and interconnect pipelining simultaneously. We optimize multiple critical paths or loops in the micro-architecture and insert the pipelines stages optimally in the blocks and wires of these loops to meet the clock frequency requirement. We propose two approaches to this problem. The first approach is based on mixed integer linear programming (MILP) which is theoretically guaranteed to produce the optimal solution, and the second one is an efficient graph-based algorithm that produces near-optimal solutions. Experimental results show that simultaneous block and interconnect pipelining leads to more than 20% improvement over wire-pipelining alone on the overall processor performance. Moreover, the graph-based approach gives solutions very close to the MILP results ( 2% more than MILP results on average) but in a much shorter runtime.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 Asia and South Pacific Design Automation Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASPDAC.2007.358107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

For modern processor designs in nanometer technologies, both block and interconnect pipelining are needed to achieve multi-gigahertz clock frequency, but previous approaches consider block pipelining and interconnect pipelining separately. For example, all recent works on wire pipelining assume pre-pipelined components and consider only inserting pipeline stages on point-to-point wire or bus connections. To the best of our knowledge, this paper is the first that considers block pipelining and interconnect pipelining simultaneously. We optimize multiple critical paths or loops in the micro-architecture and insert the pipelines stages optimally in the blocks and wires of these loops to meet the clock frequency requirement. We propose two approaches to this problem. The first approach is based on mixed integer linear programming (MILP) which is theoretically guaranteed to produce the optimal solution, and the second one is an efficient graph-based algorithm that produces near-optimal solutions. Experimental results show that simultaneous block and interconnect pipelining leads to more than 20% improvement over wire-pipelining alone on the overall processor performance. Moreover, the graph-based approach gives solutions very close to the MILP results ( 2% more than MILP results on average) but in a much shorter runtime.
基于吞吐量感知的微架构流水线优化
对于采用纳米技术的现代处理器设计,为了实现多千兆赫时钟频率,既需要块流水线又需要互连流水线,但以前的方法分别考虑块流水线和互连流水线。例如,最近所有关于有线管道的工作都假设预先管道化的组件,并且只考虑在点对点的电线或总线连接上插入管道级。据我们所知,本文是第一个同时考虑块管道和互连管道的研究。我们优化了微架构中的多个关键路径或环路,并将管道级最佳地插入这些环路的块和导线中,以满足时钟频率要求。我们对这个问题提出了两种解决方法。第一种方法是基于混合整数线性规划(MILP),理论上保证产生最优解;第二种方法是一种高效的基于图的算法,可以产生近最优解。实验结果表明,在整体处理器性能上,同时采用块和互连流水线比单独采用有线流水线提高20%以上。此外,基于图的方法提供的解决方案非常接近MILP结果(比MILP结果平均高出2%),但运行时间要短得多。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信