Micro-21 from the program chair

ACM Sigmicro Newsletter Pub Date : 1989-03-01 DOI:10.1145/378818.378848

Wen-mei W. Hwu


Citations: 0

Abstract


Micro-21 from the program chair

The instruction queue is a critical component of the proposed microarchitecture, where executable instructions are detected and delivered to the execution unit. This paper clarifies the issue of loading instructions into the instruction queue and evaluates the performance of different loading schemes.

paths are identified in the complicated UNIX programs so that trace scheduling can be effectively applied. Experimental results are provided for ten UNIX system and CAD programs, all of which exhibit complicated control structures. This is the first paper to address the issue of applying trace scheduling to complicated programs. The work is critical to adapting trace scheduling to RISCs and other upcoming pipelined, parallel microarchitectures.

The CMOS 370 has some Control Store on chip and some off. A small on-chip Control Store holds the first two microwords of each microsequence (the targets of conditional branches). A close look reveals that the two-level Control Store structure can be viewed as a programmer-managed target instruction buffer. This structure makes it possible to access one microinstruction from a large (mostly off-chip) Control Store every cycle while achieving a short cycle time.

Efficient trapping is proposed to support efficient instruction emulation in processors with hardwired control. This makes the issue of instruction set design relatively independent of the implementation (hardwired or microprogrammed).

• "Multiple Instruction Issue and Single-Chip Processors," A. Pleszkun and G. Sohi, U. of Wisconsin-Madison. Sometimes issuing multiple instructions is not a win. It would be interesting to experiment on the effect of compilation support (trace scheduling, register allocation, etc.) on the instruction issue rate. Comparing the results presented in this paper with those presented by the VLIW team, compilation support seems to be critical for issuing multiple instructions per cycle.

The paper discusses the dilemma due to the interdependence between data routing and code scheduling in ASIC code generation. This issue corresponds closely to the one regarding code scheduling and register allocation for pipelined and/or wide-instruction architectures. The trend is to consider both factors together during code generation. Dynamic reconfigurability is a very interesting feature of the proposed ASIC paradigm. However, the slow prototype makes one wonder whether a simple microprocessor could be programmed to achieve the same performance for the target applications.

• "Implementing a Prolog Machine with Multiple Functional Units," A. Singhal and Y. Patt, U. C. Berkeley. Parallel unification and execution result in a factor-of-4 speedup over the Berkeley PLM. …
