MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123509
B. Su, Jian Wang
GURPR*: a new global software pipelining algorithm
Abstract: Software pipelining is an effective loop optimization technique that has been widely used in various optimizing compilers. Although some software pipelining algorithms can optimize complicated loops globally, they still do not achieve both time efficiency and space efficiency simultaneously. In this paper, we present a new global software pipelining algorithm, GURPR*, which is applied in the URPR-1 optimizing compiler. Preliminary experiments show that GURPR* has good time efficiency as well as good space efficiency, which is quite important for a single-chip VLIW machine with limited on-chip control memory.
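For readers unfamiliar with the transformation, the following Python sketch illustrates the generic idea of software pipelining (overlapping loop iterations to form a steady-state kernel), not the GURPR* algorithm itself; the stage names and iteration count are hypothetical.

# Illustration only: overlap iterations of a three-stage loop body so the
# steady-state kernel starts one new iteration per cycle.
STAGES = ["load", "compute", "store"]    # one-cycle stages of the loop body

def pipeline_schedule(iterations):
    """Return a list of cycles; each cycle lists (iteration, stage) pairs."""
    depth = len(STAGES)
    schedule = []
    for cycle in range(iterations + depth - 1):
        ops = []
        for stage_idx, stage in enumerate(STAGES):
            it = cycle - stage_idx       # iteration executing this stage this cycle
            if 0 <= it < iterations:
                ops.append((it, stage))
        schedule.append(ops)
    return schedule

for cycle, ops in enumerate(pipeline_schedule(5)):
    print(cycle, ops)    # cycles 2-4 form the kernel: three iterations overlap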
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123475
Tse-Yu Yeh, Y. Patt
Two-level adaptive training branch prediction
Abstract: High-performance microarchitectures use, among other structures, deep pipelines to help speed up execution. The importance of a good branch predictor to the effectiveness of a deep pipeline in the presence of conditional branches is well known. In fact, the literature contains proposals for a number of branch prediction schemes. Some are static in that they use opcode information and profiling statistics to make predictions. Others are dynamic in that they use run-time execution history to make predictions. This paper proposes a new dynamic branch predictor, the Two-Level Adaptive Training scheme, which alters the branch prediction algorithm on the basis of information collected at run time. Several configurations of the Two-Level Adaptive Training Branch Predictor are introduced, simulated, and compared to simulations of other known static and dynamic branch prediction schemes. Two-Level Adaptive Training Branch Prediction achieves 97 percent accuracy on nine of the ten SPEC benchmarks, compared to less than 93 percent for other schemes. Since a prediction miss requires flushing the speculative execution already in progress, the relevant metric is the miss rate. The miss rate is 3 percent for the Two-Level Adaptive Training scheme vs. 7 percent (best case) for the other schemes. This represents more than a 100 percent improvement in reducing the number of pipeline flushes required.
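A minimal Python sketch of a two-level predictor in the spirit of the paper: a per-branch history register records recent outcomes, and the history pattern indexes a table of 2-bit saturating counters that supplies the prediction. The table sizes and indexing below are illustrative assumptions, not the configurations evaluated in the paper.

class TwoLevelPredictor:
    def __init__(self, history_bits=4, num_branches=1024):
        self.history_bits = history_bits
        self.num_branches = num_branches
        self.histories = [0] * num_branches               # per-branch history registers
        self.pattern_table = [2] * (1 << history_bits)    # 2-bit counters, init weakly taken

    def predict(self, pc):
        hist = self.histories[pc % self.num_branches]
        return self.pattern_table[hist] >= 2              # True = predict taken

    def update(self, pc, taken):
        idx = pc % self.num_branches
        hist = self.histories[idx]
        ctr = self.pattern_table[hist]
        self.pattern_table[hist] = min(3, ctr + 1) if taken else max(0, ctr - 1)
        # shift the actual outcome into this branch's history register
        self.histories[idx] = ((hist << 1) | int(taken)) & ((1 << self.history_bits) - 1)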
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123486
D. Bernstein, D. Cohen, H. Krawczyk
Code duplication: an assist for global instruction scheduling
Abstract: The recent appearance of superscalar machines (like the IBM RISC System/6000, Intel i860, etc.) dictates that instruction scheduling must be done by the compiler well beyond basic block boundaries. Moreover, when performing global instruction scheduling of the program, techniques which include speculative execution, duplication of code, software pipelining, etc. must be employed to further enhance the performance of the generated code. Recently, a scheme for such global instruction scheduling was proposed in [BR91]. Here we describe an efficient technique for supporting duplication of code in the presence of (general) acyclic control flow, as required by the global instruction scheduling framework. The algorithms have been implemented in the context of the IBM XL family of compilers, and we are in the process of evaluating them on IBM RISC System/6000 machines.
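A simplified Python sketch of the code-duplication idea (not the IBM XL implementation): when the scheduler moves an instruction out of a block upward past a join point, copies must be placed on every incoming path so that all paths still execute it. The CFG representation and block names are hypothetical, and the dependence and speculation-safety checks a real scheduler performs are omitted.

# Illustration only: moving an instruction out of a join block requires a
# copy on every incoming path.  A real scheduler would first verify that the
# move is legal (dependences, speculation safety) on each path.
def hoist_with_duplication(cfg, block, instr):
    """cfg maps block name -> {'preds': [...], 'code': [...]}."""
    cfg[block]["code"].remove(instr)
    for pred in cfg[block]["preds"]:
        cfg[pred]["code"].append(instr)    # one copy per incoming path

cfg = {
    "B1": {"preds": [], "code": ["..."]},
    "B2": {"preds": [], "code": ["..."]},
    "B3": {"preds": ["B1", "B2"], "code": ["x = a + b"]},   # join block
}
hoist_with_duplication(cfg, "B3", "x = a + b")
print(cfg["B1"]["code"], cfg["B2"]["code"])   # the moved instruction appears on both paths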
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123495
G. Singh
GRIP: graphics reduced instruction processor
Abstract: This paper presents an original approach for designing a new 2D graphics processor. The paradigm of Reduced Instruction Set Computers (RISC) is applied in the design of this graphics processor. First, a set of 2D graphics operations that are commonly encountered in graphics processing is delineated. Next, these operations are evaluated from the perspective of implementing them in a RISC-style, single-cycle, pipelined execution. Subsequently, the GRIP datapath is designed with the objective of implementing both the commonly encountered general-purpose operations and the fundamental graphics operations. The motivation behind the integrated design of a general-purpose processor with graphics capability is the author's belief in an increasing role of graphics and window-based interfaces for the ergonomics-geared software applications of the future. Such an integration also results in a reduction of system development and integration cost. The paper demonstrates that the RISC-based approach to the design of a processor with graphics capability also results in considerable performance improvements, in addition to the elimination of communication delays. The net result of adopting the proposed approach is thus a polynomial increase in the performance/cost ratio, compared to a system incorporating a separate 2D graphics co-processor.
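As a generic illustration (not taken from the paper) of the kind of 2D graphics operation such a processor targets, the following Python sketch performs a rectangular block transfer that combines source and destination pixels with a raster operation; the frame-buffer layout and the XOR raster op are assumptions.

# Illustration only: rectangular block transfer (bitblt) combining source and
# destination pixels with a raster op, here XOR, on 2-D lists of pixel words.
def bitblt(dst, src, dst_x, dst_y, width, height, rop=lambda d, s: d ^ s):
    for row in range(height):
        for col in range(width):
            dst[dst_y + row][dst_x + col] = rop(
                dst[dst_y + row][dst_x + col], src[row][col])

frame = [[0] * 8 for _ in range(8)]        # hypothetical 8x8 frame buffer
sprite = [[0xFF] * 4 for _ in range(4)]    # hypothetical 4x4 source block
bitblt(frame, sprite, 2, 2, 4, 4)
print(frame[2][2:6])                       # [255, 255, 255, 255]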
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123507
S. Beaty
Genetic algorithms and instruction scheduling
Abstract: Many difficulties are encountered when developing an instruction scheduler that produces efficacious code for multiple architectures. Heuristic-based methods were found to produce disappointing results; indeed, the goals of validity and length compete. This led to the introduction of another method to search the solution space of valid schedules: genetic algorithms. Their application to this domain proved fruitful.
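A minimal Python sketch of the general approach, under assumed details: chromosomes are instruction priority orderings, a greedy list scheduler decodes each ordering into a valid schedule, and fitness is the resulting schedule length. The dependence graph, latencies, and GA parameters are hypothetical, and the paper's own encoding and operators are not reproduced here.

import random

# Illustration only: a tiny dependence graph with one long-latency operation.
DEPS = {"a": [], "b": [], "c": [], "d": ["a"]}   # instruction -> predecessors
LAT  = {"a": 3, "b": 1, "c": 1, "d": 1}          # issue-to-result latencies

def schedule_length(priority):
    """Decode a priority ordering with greedy single-issue list scheduling."""
    issue_time, done, cycle = {}, set(), 0
    while len(done) < len(DEPS):
        ready = [i for i in priority if i not in done and
                 all(p in done and issue_time[p] + LAT[p] <= cycle for p in DEPS[i])]
        if ready:
            instr = ready[0]                     # highest-priority ready instruction
            issue_time[instr] = cycle
            done.add(instr)
        cycle += 1                               # otherwise stall for a cycle
    return max(issue_time[i] + LAT[i] for i in DEPS)

def crossover(p1, p2):
    """Order crossover: keep a prefix of p1, fill the rest in p2's order."""
    cut = random.randrange(1, len(p1))
    head = p1[:cut]
    return head + [i for i in p2 if i not in head]

def genetic_schedule(generations=30, pop_size=12):
    pop = [random.sample(list(DEPS), len(DEPS)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=schedule_length)            # shorter schedules are fitter
        survivors = pop[: pop_size // 2]
        pop = survivors + [crossover(*random.sample(survivors, 2)) for _ in survivors]
    return min(pop, key=schedule_length)

best = genetic_schedule()
print(best, schedule_length(best))   # orderings that issue "a" first reach length 4, not 6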
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123471
P. Chang, William Y. Chen, S. Mahlke, Wen-mei W. Hwu
Comparing static and dynamic code scheduling for multiple-instruction-issue processors
Abstract: This paper examines two alternative approaches to supporting code scheduling for multiple-instruction-issue processors. One is to provide a set of non-trapping instructions so that the compiler can perform aggressive static code scheduling. The application of this approach to existing commercial architectures typically requires extending the instruction set. The other approach is to support out-of-order execution in the microarchitecture so that the hardware can perform aggressive dynamic code scheduling. This approach usually does not require modifying the instruction set but requires complex hardware support. In this paper, we analyze the performance of the two alternative approaches using a set of important non-numerical C benchmark programs. A distinguishing feature of the experiment is that the code for the dynamic approach has been optimized and scheduled as much as allowed by the architecture. The hardware is only responsible for the additional reordering that cannot be performed by the compiler. The overall result is that the dynamic and static approaches are comparable in performance. When applied to a four-instruction-issue processor, both methods achieve more than two times speedup over a high-performance single-instruction-issue processor. However, the performance of each scheme varies among the benchmark programs. To explain this variation, we have identified the conditions in these programs that make one approach perform better than the other.
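The following toy Python model illustrates the trade-off being compared, under simplifying assumptions (unit latencies, issue stops for the cycle at the first blocked instruction): an in-order multiple-issue machine reaches a high issue rate only if the compiler has already grouped independent instructions, whereas out-of-order hardware can perform the equivalent reordering at run time. The instruction names and dependences are made up.

ISSUE_WIDTH = 4

def inorder_issue_cycles(instrs, deps):
    """Cycles to issue instrs in program order, up to ISSUE_WIDTH per cycle;
    issue stops for the cycle at the first instruction that depends on one
    issued in the same cycle (unit latencies assumed)."""
    cycles, group = 0, []
    for i in instrs:
        blocked = any(d in group for d in deps.get(i, []))
        if blocked or len(group) == ISSUE_WIDTH:
            cycles += 1
            group = []
        group.append(i)
    return cycles + (1 if group else 0)

deps = {"c": ["a"], "d": ["c"]}                               # dependence chain a -> c -> d
unscheduled = ["a", "c", "d", "b", "e", "f", "g", "h", "i"]   # chain issued back to back
scheduled   = ["a", "b", "e", "f", "c", "g", "h", "i", "d"]   # independents fill the slots
print(inorder_issue_cycles(unscheduled, deps))   # 4 cycles
print(inorder_issue_cycles(scheduled, deps))     # 3 cycles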
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123481
Reese B. Jones, V. Allan
Software pipelining: an evaluation of enhanced pipelining
Abstract: Software pipelining is a fine-grain loop optimization technique for architectures that support synchronous parallel execution. We compare Lam's software pipelining algorithm with Ebcioğlu and Nakatani's technique. This research seems to indicate that the Enhanced Pipeline Scheduling algorithm is a good general-purpose software pipelining algorithm, because it performs only slightly worse than Lam's algorithm on single basic block loops and should perform better than Lam's algorithm on multiple basic block loops. However, if pipelining single basic block loops is the goal, it appears that it would be better to use Lam's algorithm. We also propose a technique for changing the resource-constrained scheduling priority of operations to prevent operations from future iterations from being significantly delayed due to resource conflicts.
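As background for such evaluations, the initiation interval of a software-pipelined loop is bounded below by both resource usage and loop-carried recurrences; the Python sketch below computes this standard lower bound (it is not either paper's algorithm). The operation counts, resource mix, and recurrence are hypothetical.

import math

def min_initiation_interval(op_uses, num_units, recurrences):
    """op_uses: resource -> uses per iteration; num_units: resource -> available units;
    recurrences: list of (total latency, iteration distance) for dependence cycles."""
    res_mii = max(math.ceil(op_uses[r] / num_units[r]) for r in op_uses)
    rec_mii = max(math.ceil(lat / dist) for lat, dist in recurrences) if recurrences else 1
    return max(res_mii, rec_mii)

# Hypothetical loop: 6 ALU ops and 2 memory ops per iteration on a machine with
# 2 ALUs and 1 memory port, plus a recurrence of latency 3 spanning 1 iteration.
print(min_initiation_interval({"alu": 6, "mem": 2}, {"alu": 2, "mem": 1}, [(3, 1)]))   # 3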
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123473
B. Bray, M. Flynn
Strategies for branch target buffers
Abstract: Achieving high instruction issue rates depends on the ability to dynamically predict branches. We compare two schemes for dynamic branch prediction: a separate branch target buffer and an instruction cache based branch target buffer. For instruction caches of 4KB and greater, instruction cache based branch prediction performance is a strong function of line size and a weak function of instruction cache size. An instruction cache based branch target buffer with a line size of 8 (or 4) instructions performs about as well as a separate branch target buffer structure which has 64 (or 256, respectively) entries. Software can rearrange basic blocks in a procedure to reduce the number of taken branches, thus reducing the amount of branch prediction hardware needed. With software assistance, predicting all branches as not taken performs as well as a 4-entry branch target buffer without assistance, and a 4-entry branch target buffer with assistance performs as well as a 32-entry branch target buffer without assistance. The instruction cache based branch target buffer also benefits from the software, but only for line sizes of more than 4 instructions.
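A minimal Python sketch of a separate, direct-mapped branch target buffer of the kind compared in the paper: the fetch PC selects an entry, and on a tag match the stored target is used as the predicted next fetch address. The entry count, indexing, and replacement policy are illustrative assumptions.

class BranchTargetBuffer:
    def __init__(self, entries=64):
        self.entries = entries
        self.tags = [None] * entries
        self.targets = [None] * entries

    def predict(self, pc):
        """Return the predicted target address, or None to predict fall-through."""
        idx = pc % self.entries
        if self.tags[idx] == pc:
            return self.targets[idx]
        return None

    def update(self, pc, taken, target):
        idx = pc % self.entries
        if taken:
            self.tags[idx], self.targets[idx] = pc, target
        elif self.tags[idx] == pc:
            self.tags[idx] = None      # drop branches that fell through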
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123482
M. Smotherman, Sanjay M. Krishnamurthy, P. Aravind, David Hunnicutt
Efficient DAG construction and heuristic calculation for instruction scheduling
Abstract: A number of heuristic algorithms for DAG-based instruction scheduling have been proposed over the past few years. In this paper, we explore the efficiency of three DAG construction algorithms and survey 26 proposed heuristics and their methods of calculation. Six scheduling algorithms are analyzed in terms of DAG construction and heuristic use. DAG structural statistics and scheduling times for the three construction algorithms are given for several popular benchmarks. The table-building algorithms are shown to be extremely efficient for programs with large basic blocks and yet appropriately handle the problem of retaining important transitive arcs. The node revisitation overhead of intermediate heuristic calculation steps is also investigated and shown to be negligible.
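A small Python sketch of table-driven DAG construction of the sort the abstract alludes to: a table of the most recent definition of each register turns dependence detection into a lookup rather than a pairwise comparison of instructions, and a longest-path heuristic is then computed over the DAG. The instruction format is hypothetical and only flow dependences are recorded; a full scheduler would also record anti and output dependences.

def build_dag(block):
    """block: list of (dest, [sources]).  Returns edges as (producer, consumer) index pairs."""
    last_def = {}          # register -> index of the instruction that last wrote it
    edges = []
    for i, (dest, srcs) in enumerate(block):
        for s in srcs:
            if s in last_def:
                edges.append((last_def[s], i))   # flow dependence via table lookup
        last_def[dest] = i
    return edges

def critical_path_lengths(block, edges):
    """Common scheduling heuristic: longest path from each node to a leaf."""
    succs = {i: [] for i in range(len(block))}
    for p, c in edges:
        succs[p].append(c)
    length = [1] * len(block)
    for i in reversed(range(len(block))):        # block order is a topological order
        for c in succs[i]:
            length[i] = max(length[i], 1 + length[c])
    return length

block = [("r1", ["r2"]), ("r3", ["r1"]), ("r4", ["r1", "r3"]), ("r5", ["r6"])]
print(build_dag(block))                                  # [(0, 1), (0, 2), (1, 2)]
print(critical_path_lengths(block, build_dag(block)))    # [3, 2, 1, 1]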
MICRO 24 · Pub Date: 1991-09-01 · DOI: 10.1145/123465.123488
M. Breternitz, John Paul Shen
Implementation optimization techniques for architecture synthesis of application-specific processors
Abstract: An architecture synthesis method for the automated design of high-performance application-specific processors has been proposed. This method divides the design task into the Specification Optimization (behavioral) and Implementation Optimization (structural) phases. In an earlier paper, powerful algorithms for performing specification optimization are presented. High performance is achieved via exploitation of fine-grain parallelism. The architecture design style uses a template resembling a scalable Very Long Instruction Word (VLIW) processor. This paper presents new algorithms for performing implementation optimization, which map the optimized specification, in the form of highly parallelized code, to efficient hardware implementations. A scalable implementation template is used to constrain the implementation style. Graph coloring algorithms are employed to produce the optimized implementations. The entire architecture synthesis procedure has been implemented and applied to numerous examples, and results on these examples are presented. Speedups in the range of 2.6 to 7.7 over contemporary RISC processors have been obtained. The computation times needed for the synthesis of these examples are on the order of a few seconds.
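A generic greedy graph coloring sketch in Python, since the abstract names graph coloring as the core technique: nodes represent values or operations whose lifetimes or time slots conflict, edges join conflicting pairs, and colors correspond to physical registers or functional-unit instances. The conflict graph and the most-constrained-first ordering are assumptions; the paper's own coloring algorithms are not reproduced here.

def greedy_color(conflicts):
    """conflicts: dict node -> set of conflicting nodes.  Returns node -> color index."""
    color = {}
    # color the most-constrained (highest-degree) nodes first
    for node in sorted(conflicts, key=lambda n: len(conflicts[n]), reverse=True):
        used = {color[n] for n in conflicts[node] if n in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

conflicts = {
    "v1": {"v2", "v3"},
    "v2": {"v1", "v3"},
    "v3": {"v1", "v2", "v4"},
    "v4": {"v3"},
}
print(greedy_color(conflicts))   # three colors suffice: v1, v2, v3 differ; v4 reuses one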