Comparing static and dynamic code scheduling for multiple-instruction-issue processors
P. Chang, William Y. Chen, S. Mahlke, Wen-mei W. Hwu
MICRO 24, September 1991. DOI: 10.1145/123465.123471
This paper examines two alternative approaches to supporting code scheduling for multiple-instruction-issue processors. One is to provide a set of non-trapping instructions so that the compiler can perform aggressive static code scheduling. Applying this approach to existing commercial architectures typically requires extending the instruction set. The other approach is to support out-of-order execution in the microarchitecture so that the hardware can perform aggressive dynamic code scheduling. This approach usually does not require modifying the instruction set but requires complex hardware support. In this paper, we analyze the performance of the two alternative approaches using a set of important nonnumerical C benchmark programs. A distinguishing feature of the experiment is that the code for the dynamic approach has been optimized and scheduled as much as the architecture allows; the hardware is responsible only for the additional reordering that the compiler cannot perform. The overall result is that the dynamic and static approaches are comparable in performance. When applied to a four-instruction-issue processor, both methods achieve a speedup of more than two over a high-performance single-instruction-issue processor. However, the performance of each scheme varies among the benchmark programs. To explain this variation, we have identified the conditions in these programs that make one approach perform better than the other.
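
To make the static-scheduling approach described in the abstract concrete, the following minimal C sketch (an illustration, not taken from the paper) shows why non-trapping instructions matter: an ordinary load cannot be hoisted above the branch that guards it, because it could fault, whereas a non-trapping load can be issued speculatively and its result discarded on the other path. The speculative_load helper is a hypothetical stand-in for such an architecture-specific instruction.

#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a hardware non-trapping load: on a real
 * machine the instruction would return an arbitrary value instead of
 * faulting on a bad address.  Here a NULL guard emulates that fault
 * suppression so the sketch runs as ordinary C. */
static inline int speculative_load(const int *p) {
    return (p != NULL) ? *p : 0;
}

/* Before scheduling: the load of *p is control-dependent on the branch,
 * so a trapping load may not legally execute when p == NULL. */
int sum_guarded(const int *p, int fallback) {
    if (p != NULL)
        return *p + 1;
    return fallback;
}

/* After aggressive static scheduling with a non-trapping load: the load
 * (and the add that consumes it) is hoisted above the branch so it can
 * issue in parallel with earlier work; its result is simply discarded
 * on the p == NULL path. */
int sum_scheduled(const int *p, int fallback) {
    int speculated = speculative_load(p) + 1;  /* hoisted above the branch */
    if (p != NULL)
        return speculated;
    return fallback;
}

int main(void) {
    int x = 41;
    printf("%d %d\n", sum_guarded(&x, -1),  sum_scheduled(&x, -1));   /* 42 42 */
    printf("%d %d\n", sum_guarded(NULL, -1), sum_scheduled(NULL, -1)); /* -1 -1 */
    return 0;
}

The dynamic alternative studied in the paper achieves a similar reordering in hardware at run time, via out-of-order execution, without changing the instruction set.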