[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21最新文献

Hardware Support For Large Atomic Units in Dynamically Scheduled Machines 动态调度机器中大型原子单元的硬件支持

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62535

S. Melvin, M. Shebanow, Y. Patt

{"title":"Hardware Support For Large Atomic Units in Dynamically Scheduled Machines","authors":"S. Melvin, M. Shebanow, Y. Patt","doi":"10.1145/62504.62535","DOIUrl":"https://doi.org/10.1145/62504.62535","url":null,"abstract":"Microarchitectures that implement conventional instruction set architectures are usually limited in that they are only able to execute a small number of microoperations concurrently. This limitation is due in part to the fact that the units of work that the hardware treats as indivisible are small. While this limitation is not important for microarchitectures with a low level of functionality, it can be significant if the goal is to build hardware that can support a large number of microoperations executing concurrently. In this paper we address the tradeoffs associated with the sizes of the various units of work that a processor considers indivisible, or atomic. We argue that by allowing larger units of work to be atomic, restrictions on concurrent operation are reduced and performance is increased. We outline the implementation of a front end for a dynamically scheduled processor with hardware support for large atomic units. We discuss tradeoffs in the design and show that with a modest investment in hardware, the run-time advantages of large atomic units can be realized without the need to alter the instruction set architecture.","PeriodicalId":378625,"journal":{"name":"[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1988-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116959562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 87

Modelling The Effects Of Instruction Queue Loading On A Static Instruction Stream Micro-architecture 静态指令流微体系结构中指令队列加载影响的建模

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62509

J. Jacobs, A. Uht, R. C. Ord

引用次数: 2

Organization Of Array Data For Concurrent Memory Access 用于并发内存访问的数组数据的组织

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62672

M. Breternitz, John Paul Shen

引用次数: 10

Microprogramming In A Multiprocessor Data Acquisition System 多处理器数据采集系统的微程序设计

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1109/MICRO.1988.639272

S. D'Angelo, L. Lisca, A. Proserpio, G. Sechi

引用次数: 2

Lazy Data Routing And Greedy Scheduling For Application-specific Signal Processors 针对特定应用的信号处理器的延迟数据路由和贪婪调度

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62676

K. Rimey, P. Hilfinger

{"title":"Lazy Data Routing And Greedy Scheduling For Application-specific Signal Processors","authors":"K. Rimey, P. Hilfinger","doi":"10.1145/62504.62676","DOIUrl":"https://doi.org/10.1145/62504.62676","url":null,"abstract":"This paper concerns code generation for a troublesome class of horizontal-instruction-word architectures (whose machine language resembles horizontal microcode). These are application-specifrcprocessors, minimalistic programmable processors to be incorporated into application-specific signal processing chips. The processors of interest afford some opportunity for pipelined and for parallel operation of functional units, but do not provide enough bandwidth to store intermediate results in memory or in a register file. Instead, a typical datapath provides direct connections between functional units (often through pipeline registers), forming an irregular network. The usual way to generate horizontal code is to fist generate a loose sequence of microoperations (vertical code) and then pack these tightly into instructions in a compaction post-pass. Local compaction, which packs one straight-line code segment at a time, is now well-understood; theresearch community has largely shifted its attention to global compaction. For our application-specific processors, however, packing microoperations in a separate pass works poorly and generating good horizontal code for even straight-line code segments presents a challenge. Not only must the code generator choose which functional units to use; it must also choose how to route each intermediate result from the output of one functional unit to the input of another. This task is called data routing. How best to route a particular value depends on the time interval between its definition and use or uses, as well as on the datapath resources that are free during that interval. For this reason we abandon the compaction post-pass, and instead pack or schedule microoperations as they are generated. We consider only local scheduling in this paper. Our local scheduler is similar to the “operation scheduler” developed by Fisher et al. [l] for use in a trace-scheduling compiler for a VLIW supercomputer. However, we consider machines in which intermediate results must often reside in hot spots such as busses and latches as well as registers that would obstruct computation if tied up. Like Fisher et al.,","PeriodicalId":378625,"journal":{"name":"[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1988-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115008580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 21

The Trap As A Control Flow Mechanism 陷阱作为一种控制流机制

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62527

J. A. Chandross, H. Jagadish, A. Asthana

引用次数: 2

A High-speed Hardware Unit For A Subset of Logic Resolution 用于逻辑分辨率子集的高速硬件单元

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62542

D. Wong

引用次数: 0

Implementing a Prolog Machine with Multiple Functional Units 具有多个功能单元的Prolog机的实现

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62517

A. Singhal, Y. Patt

引用次数: 1

Data Dependency Graph Bracing 数据依赖图支撑

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62670

V. Allan

{"title":"Data Dependency Graph Bracing","authors":"V. Allan","doi":"10.1145/62504.62670","DOIUrl":"https://doi.org/10.1145/62504.62670","url":null,"abstract":"The Sunburst compiler refined at Utah State University employs a powerful mechanism for management of data anti-dependencies in data dependency graphs, <italic>DDG's</italic>: the <italic>DDG Bracer</italic>. The term <italic>bracing</italic><supscrpt>1</supscrpt> is used to mean the fastening of two or more parts together. There are two major goals in bracing: 1) semantic correctness, and 2) creation of an optimal DDG. Bracing provides necessary joining of code fragments, produced by a divide and conquer code generation algorithm, while yielding multiple code sequences.\u0000Since no anti-dependency arcs are present, the input <italic>DDG's</italic> are said to be in <italic>normal form</italic>. Because anti-dependency arcs occur only when a resource must be reused, a <italic>DDG</italic> in normal form represents infinite resources. The output <italic>DDG</italic> is a merging of the two input <italic>DDG's</italic> such that data dependency arcs between the two <italic>DDG's</italic> are inserted and data anti-dependency arcs are added to sequentialize the use of common resources.\u0000Vegdahl [Veg82] was one of the first to recognize the importance of live track manipulation. A <italic>live track</italic> is an ordered pair: the first component is the microoperation node ( MO) in which a resource is born, and the second component is the set of nodes in which the resource dies.","PeriodicalId":378625,"journal":{"name":"[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1988-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114820590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Trace Selection For Compiling Large C Application Programs To Microcode 将大型C应用程序编译为微码的跟踪选择

[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 Pub Date : 1988-01-03 DOI: 10.1145/62504.62511

P. Chang, W. Hwu

{"title":"Trace Selection For Compiling Large C Application Programs To Microcode","authors":"P. Chang, W. Hwu","doi":"10.1145/62504.62511","DOIUrl":"https://doi.org/10.1145/62504.62511","url":null,"abstract":"Microcode optimization techniques such as code scheduling and resource allocation can benefit significantly by reducing uncertainties in program control flow. A trace selection algorithm with profiling information reduces the uncertainties in program control flow by identifying sequences of frequently invoked basic blocks as traces. These traces are treated as sequential codes for optimization purposes. Optimization based on traces is especially useful when the code size is large and the control structure is complicated enough to defeat hand optimizations. However, most of the experimental results reported to date are based on small benchmarks with simple control structures.\u0000For different trace selection algorithms, we report the distribution of control transfers categorized according to their potential impact on the microcode optimizations. The experimental results are based on ten C application programs which exhibit large code size and complicated control structure. The measured data for each program is accumulated across a large number of input files to ensure the reliability of the result. All experiments are performed automatically using our IMPACT C compiler which contains integrated profiling and analysis tools.","PeriodicalId":378625,"journal":{"name":"[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1988-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128245904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 97