{"title":"针对超标量和流水线处理器的编译器优化","authors":"Vishnu P. Bharadwaj, M. Rao","doi":"10.1109/DISCOVER.2016.7806224","DOIUrl":null,"url":null,"abstract":"The exploitation of parallelism at both the multiprocessor or multicore level and at the instruction level is the means to achieve high-performance. The compiler for VLIW and superscalar processors must expose sufficient parallelism to effectively utilize the parallel hardware. The amount of instruction level parallelism available to VLIW processors or superscalar processors can be limited. This will limit the performance of these processors to a certain extent. However, with compiler optimization techniques, its performance can be increased to greater extent. This evaluation shows that utilizing the existing resources of the processor with certain programmer constraints and an efficient scheduling of independent and dependent blocks of instructions, we can increase the performance of the processors. As compiler optimization interact with the micro-architecture in complex ways, certain programmer constraints can be added to reduce the complexity and help the compiler to structure the Assembly code in a manner which can be used for out-of-order execution of the code. This paper provides new methods and improvements for the structure of the Assembly code for execution on superscalar processors.","PeriodicalId":383554,"journal":{"name":"2016 IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Compiler optimization for superscalar and pipelined processors\",\"authors\":\"Vishnu P. Bharadwaj, M. Rao\",\"doi\":\"10.1109/DISCOVER.2016.7806224\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The exploitation of parallelism at both the multiprocessor or multicore level and at the instruction level is the means to achieve high-performance. The compiler for VLIW and superscalar processors must expose sufficient parallelism to effectively utilize the parallel hardware. The amount of instruction level parallelism available to VLIW processors or superscalar processors can be limited. This will limit the performance of these processors to a certain extent. However, with compiler optimization techniques, its performance can be increased to greater extent. This evaluation shows that utilizing the existing resources of the processor with certain programmer constraints and an efficient scheduling of independent and dependent blocks of instructions, we can increase the performance of the processors. As compiler optimization interact with the micro-architecture in complex ways, certain programmer constraints can be added to reduce the complexity and help the compiler to structure the Assembly code in a manner which can be used for out-of-order execution of the code. 
This paper provides new methods and improvements for the structure of the Assembly code for execution on superscalar processors.\",\"PeriodicalId\":383554,\"journal\":{\"name\":\"2016 IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DISCOVER.2016.7806224\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DISCOVER.2016.7806224","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Compiler optimization for superscalar and pipelined processors
Exploiting parallelism both at the multiprocessor or multicore level and at the instruction level is the principal means of achieving high performance. The compiler for VLIW and superscalar processors must expose sufficient parallelism to utilize the parallel hardware effectively. The amount of instruction-level parallelism available to VLIW or superscalar processors can be limited, which constrains the performance of these processors to a certain extent. With compiler optimization techniques, however, their performance can be improved substantially. This evaluation shows that by utilizing the processor's existing resources under certain programmer constraints, and by efficiently scheduling independent and dependent blocks of instructions, the performance of these processors can be increased. Because compiler optimizations interact with the microarchitecture in complex ways, programmer constraints can be added to reduce this complexity and help the compiler structure the assembly code in a form suited to out-of-order execution. This paper provides new methods and improvements for structuring assembly code for execution on superscalar processors.
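The core idea in the abstract, scheduling independent and dependent blocks of instructions so that superscalar hardware stays busy, can be illustrated with a small C sketch. This example is not taken from the paper; the function names and the choice of four accumulators are illustrative assumptions only, showing in source form the kind of restructuring a compiler or programmer might apply to expose instruction-level parallelism.

/* Illustrative sketch (not from the paper): breaking one long dependency
 * chain into independent partial sums so a superscalar core can issue
 * several additions in parallel. Names are hypothetical. */
#include <stddef.h>

/* Naive reduction: every add depends on the previous result, so the
 * dependent chain limits throughput to one add per add-latency. */
double sum_serial(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Restructured reduction: four independent accumulators give the
 * compiler's scheduler (and the out-of-order hardware) independent
 * instruction streams to overlap. */
double sum_unrolled(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* remainder elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}

The second version is one plausible outcome of the kind of programmer constraint and scheduling the abstract describes: the dependent chain is split into independent blocks that a superscalar or out-of-order processor can execute concurrently, whereas the serial version leaves little parallelism for the hardware to exploit.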