Architectural considerations for application-specific counterflow pipelines

B. Childers, J. Davidson
Proceedings 20th Anniversary Conference on Advanced Research in VLSI, 1999-03-21
DOI: 10.1109/ARVLSI.1999.756034 (https://doi.org/10.1109/ARVLSI.1999.756034)
Citations: 7

Abstract

Application-specific processor design is a promising approach for meeting the performance and cost goals of a system. Application-specific processors are especially promising for embedded systems (e.g., digital cameras and cellular phones), where a small increase in performance and decrease in cost can have a large impact on a product's viability. Sproull, Sutherland, and Molnar (see IEEE Design and Test of Computers, vol. 11, no. 3, pp. 48-59, 1994) have proposed a new pipeline organization called the Counterflow Pipeline (CFP). This paper evaluates CFP design alternatives and shows that the CFP is an ideal architecture for fast, low-cost design of high-performance processors customized for computation-intensive embedded applications. First, we describe why CFPs are particularly well suited to realizing application-specific processors. Second, we describe how a CFP tailored to an application can be constructed automatically. Third, we present measurements that evaluate CFP design trade-offs and show that CFPs provide speculative execution, out-of-order execution, and register renaming matched to an application. Fourth, we show that asynchronous counterflow pipelines achieve high performance by reducing the average execution latency of instructions relative to synchronous implementations. Finally, we demonstrate that custom CFPs achieve cycles-per-instruction measurements that are competitive with 4-way superscalar out-of-order processors at potentially low design complexity.