International Symposium on Code Generation and Optimization, 2003 (CGO 2003): Latest Publications

Coupling on-line and off-line profile information to improve program performance
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191534
C. Krintz
Abstract: In this paper we describe a novel execution environment for Java programs that substantially improves execution performance by incorporating both on-line and off-line profile information to guide dynamic optimization. By using both types of profile collection techniques, we are able to exploit the strengths of each constituent approach: profile accuracy and low overhead. Such coupling also reduces the negative impact of these approaches when each is used in isolation. On-line profiling introduces overhead for dynamic instrumentation, measurement, and decision making. Off-line profile information can be inaccurate when program inputs for execution and optimization differ from those used for profiling. To combat these drawbacks and to achieve the benefits from both on-line and off-line profiling, we developed a dynamic compilation system (based on JikesRVM) that makes use of both. As a result, we are able to improve Java program performance by 9% on average for the programs studied.
Citations: 52
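The coupling described in this abstract can be pictured with a small, hypothetical sketch: an offline profile seeds the optimizer's hotness estimates, and cheap online sampling only corrects them when the current input behaves differently, which is how the accuracy/overhead trade-off gets balanced. Every name below (HotnessEstimator, choose_opt_level, the thresholds) is invented for illustration and does not come from the paper or from JikesRVM.

```python
# Hypothetical sketch: seed dynamic-optimization decisions with an offline
# profile, then let cheap online sample counts correct it for the current run.

OFFLINE_PROFILE = {"Foo.bar": 0.42, "Foo.baz": 0.03}   # fraction of past run time

class HotnessEstimator:
    def __init__(self, offline, trust=0.7):
        self.offline = offline          # profile gathered on training inputs
        self.online = {}                # sample counts from the current run
        self.total_samples = 0
        self.trust = trust              # weight given to the offline estimate

    def sample(self, method):
        """Called by a cheap timer-based sampler in the running VM."""
        self.online[method] = self.online.get(method, 0) + 1
        self.total_samples += 1

    def hotness(self, method):
        """Blend the offline and online estimates of how hot a method is."""
        off = self.offline.get(method, 0.0)
        if self.total_samples == 0:
            return off
        on = self.online.get(method, 0) / self.total_samples
        return self.trust * off + (1.0 - self.trust) * on

    def choose_opt_level(self, method):
        h = self.hotness(method)
        if h > 0.25:
            return "opt2"               # aggressive recompilation
        if h > 0.05:
            return "opt1"
        return "baseline"

est = HotnessEstimator(OFFLINE_PROFILE)
for _ in range(30):
    est.sample("Foo.bar")
print(est.choose_opt_level("Foo.bar"))  # -> "opt2"
```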
An infrastructure for adaptive dynamic optimization
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191551
Derek Bruening, Timothy Garnett, Saman P. Amarasinghe
Abstract: Dynamic optimization is emerging as a promising approach to overcome many of the obstacles of traditional static compilation. But while there are a number of compiler infrastructures for developing static optimizations, there are very few for developing dynamic optimizations. We present a framework for implementing dynamic analyses and optimizations. We provide an interface for building external modules, or clients, for the DynamoRIO dynamic code modification system. This interface abstracts away many low-level details of the DynamoRIO runtime system while exposing a simple and powerful, yet efficient and lightweight, API. This is achieved by restricting optimization units to linear streams of code and using adaptive levels of detail for representing instructions. The interface is not restricted to optimization and can be used for instrumentation, profiling, dynamic translation, etc. To demonstrate the usefulness and effectiveness of our framework, we implemented several optimizations. These improve the performance of some applications by as much as 40% relative to native execution. The average speedup relative to base DynamoRIO performance is 12%.
Citations: 571
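The abstract's key design point is a client interface whose optimization units are linear streams of code. The real DynamoRIO client API is a C API; the fragment below is only a language-neutral Python mock of the callback pattern it describes (register a hook, receive a linear instruction list, return a rewritten list), with every class and function name invented for illustration.

```python
# Purely illustrative mock (NOT the DynamoRIO client API): a runtime hands each
# linear instruction stream to registered client callbacks, which may rewrite
# it before the code is emitted.

class Instr:
    def __init__(self, opcode, operands=()):
        self.opcode = opcode
        self.operands = operands

    def __repr__(self):
        return f"{self.opcode} {', '.join(map(str, self.operands))}".strip()

class MockRuntime:
    def __init__(self):
        self.trace_hooks = []

    def register_trace_hook(self, fn):
        """A client registers a callback that sees each linear code fragment."""
        self.trace_hooks.append(fn)

    def emit_trace(self, instrs):
        for hook in self.trace_hooks:
            instrs = hook(instrs)
        return instrs

def strip_nops(instrs):
    """Toy client pass: delete nops from the linear instruction stream."""
    return [i for i in instrs if i.opcode != "nop"]

rt = MockRuntime()
rt.register_trace_hook(strip_nops)
trace = [Instr("mov", ("eax", 1)), Instr("nop"), Instr("add", ("eax", "ebx"))]
print(rt.emit_trace(trace))   # -> [mov eax, 1, add eax, ebx]
```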
Improving quasi-dynamic schedules through region slip
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191541
Francesco Spadini, Brian Fahs, Sanjay J. Patel, S. Lumetta
Abstract: Modern processors perform dynamic scheduling to achieve better utilization of execution resources. A schedule created at run time is often better than one created at compile time, as it can dynamically adapt to specific events encountered during execution. In this paper, we examine some fundamental impediments to effective static scheduling. More specifically, we examine why schedules generated quasi-dynamically by a low-level runtime optimizer and executed on a statically scheduled machine perform worse than a dynamically scheduled approach. We observe that such schedules suffer because of region boundaries and a skewed distribution of parallelism toward the beginning of a region. To overcome these limitations, we investigate a new concept, region slip, in which the schedules of different statically scheduled regions can be interleaved in the processor issue queue to reduce the region-boundary effects that cause empty issue slots.
Citations: 10
Compiler optimization-space exploration
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191546
Spyridon Triantafyllis, Manish Vachharajani, David I. August
Abstract: To meet the demands of modern architectures, optimizing compilers must incorporate an ever larger number of increasingly complex transformation algorithms. Since code transformations may often degrade performance or interfere with subsequent transformations, compilers employ predictive heuristics to guide optimizations by predicting their effects a priori. Unfortunately, the unpredictability of optimization interaction and the irregularity of today's wide-issue machines severely limit the accuracy of these heuristics. As a result, compiler writers may temper high-variance optimizations with overly conservative heuristics or may exclude these optimizations entirely. While this process results in a compiler capable of generating good average code quality across the target benchmark set, it comes at the cost of missed optimization opportunities in individual code segments. To replace predictive heuristics, researchers have proposed compilers that explore many optimization options, selecting the best one a posteriori. Unfortunately, these existing iterative compilation techniques are not practical, for reasons of compile time and applicability. We present the Optimization-Space Exploration (OSE) compiler organization, the first practical iterative compilation strategy applicable to optimizations in general-purpose compilers. Instead of replacing predictive heuristics, OSE uses the compiler writer's knowledge encoded in the heuristics to select a small number of promising optimization alternatives for a given code segment. Compile time is limited by evaluating only these alternatives, and only for hot code segments, using a general compile-time performance estimator. An OSE-enhanced version of Intel's highly tuned, aggressively optimizing production compiler for IA-64 yields a significant performance improvement, more than 20% in some cases, on Itanium for SPEC codes.
Citations: 269
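The OSE idea in the abstract above, heuristics prune the space to a few candidate configurations and a static estimator picks among them, can be captured in a tiny hypothetical loop. The configuration knobs, the pruning rule, and the cycle model below are all invented placeholders, not anything from Intel's compiler.

```python
# Hypothetical OSE-style loop: for each hot code segment, evaluate only a
# small, heuristically chosen set of optimization configurations with a
# compile-time performance estimator and keep the best one.
import itertools

CONFIGS = [
    {"unroll": u, "if_convert": ic}
    for u, ic in itertools.product((1, 2, 4), (False, True))
]

def promising_configs(segment, k=3):
    """Stand-in for the compiler writer's heuristics: prune the space to k options."""
    ranked = sorted(CONFIGS, key=lambda c: abs(c["unroll"] - segment["loop_size"] // 8))
    return ranked[:k]

def estimate_cycles(segment, config):
    """Stand-in for a general compile-time performance estimator."""
    cycles = segment["base_cycles"] / config["unroll"]
    if config["if_convert"] and segment["branchy"]:
        cycles *= 0.9                       # fewer mispredictions after if-conversion
    return cycles + 5 * config["unroll"]    # crude code-size / i-cache penalty

def ose_compile(segment):
    """Pick the configuration the estimator ranks best for this hot segment."""
    return min(promising_configs(segment), key=lambda c: estimate_cycles(segment, c))

hot_segment = {"loop_size": 16, "base_cycles": 400, "branchy": True}
print(ose_compile(hot_segment))   # -> {'unroll': 2, 'if_convert': True}
```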
The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191529
James C. Dehnert, Brian K. Grant, John Banning, Richard Johnson, Thomas Kistler, Alexander Klaiber, Jim Mattson
Abstract: Transmeta's Crusoe microprocessor is a full, system-level implementation of the x86 architecture, comprising a native VLIW microprocessor with a software layer, the Code Morphing Software (CMS), that combines an interpreter, dynamic binary translator, optimizer, and run-time system. In its general structure, CMS resembles other binary translation systems described in the literature, but it is unique in several respects. The wide range of PC workloads that CMS must handle gracefully in real-life operation, plus the need for full system-level x86 compatibility, expose several issues that have received little or no attention in previous literature, such as exceptions and interrupts, I/O, DMA, and self-modifying code. In this paper we discuss some of the challenges raised by these issues, and present the techniques developed in Crusoe and CMS to meet those challenges. The key to these solutions is the Crusoe paradigm of aggressive speculation, recovery to a consistent x86 state using unique hardware commit-and-rollback support, and adaptive retranslation when exceptions occur too often to be handled efficiently by interpretation.
Citations: 306
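The commit-and-rollback paradigm named at the end of this abstract follows a familiar shape: run a translated region speculatively against working state, commit if it completes, roll back to the last consistent x86 state and fall back to interpretation if an exception occurs. The toy model below shows only that shape; it does not reflect any actual Crusoe or CMS data structures, and all names are invented.

```python
# Toy model of commit-and-rollback speculation (illustrative only).

class SpeculativeState:
    def __init__(self, initial):
        self.committed = dict(initial)   # last consistent x86 state
        self.working = dict(initial)     # state the translated region mutates

    def commit(self):
        self.committed = dict(self.working)

    def rollback(self):
        self.working = dict(self.committed)

def interpret(state, region):
    """Stand-in for the slow, exception-precise interpreter."""
    pass

def run_translation(state, region):
    try:
        for reg, value in region(state.working):   # speculative native execution
            state.working[reg] = value
        state.commit()                              # region finished: make it architectural
    except Exception:
        state.rollback()                            # recover the consistent x86 state
        interpret(state, region)                    # re-execute precisely

st = SpeculativeState({"eax": 0})
run_translation(st, lambda regs: [("eax", regs["eax"] + 1)])
print(st.committed["eax"])   # -> 1
```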
Speculative register promotion using advanced load address table (ALAT)
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191539
Jin Lin, Tong Chen, W. Hsu, P. Yew
Abstract: The pervasive use of pointers with complicated patterns in C programs often constrains compiler alias analysis to yield conservative register allocation and promotion. Speculative register promotion with hardware support has the potential to more aggressively promote memory references into registers in the presence of aliases. This paper studies the use of the advanced load address table (ALAT), a data speculation feature defined in the IA-64 architecture, for speculative register promotion. An algorithm for speculative register promotion based on partial redundancy elimination is presented. The algorithm is implemented in Intel's Open Research Compiler (ORC). Experiments on SPEC CPU2000 benchmark programs show that speculative register promotion can improve the performance of some benchmarks by 1% to 7%.
Citations: 30
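The ALAT lets the compiler keep a memory value in a register across possibly aliasing stores: an advanced load (ld.a on IA-64) records the address, any store to that address invalidates the entry, and a later check (chk.a) triggers recovery if invalidation happened. The Python below models only those semantics as a sketch; it is not IA-64 code and not the paper's PRE-based promotion algorithm.

```python
# Toy model of ALAT-style speculative register promotion (semantics only).

class ALAT:
    def __init__(self):
        self.entries = {}              # address -> register holding the promoted value

    def advanced_load(self, memory, addr, reg, regs):
        """ld.a: load into a register and record the address in the ALAT."""
        regs[reg] = memory[addr]
        self.entries[addr] = reg

    def store(self, memory, addr, value):
        """Any store snoops the ALAT and invalidates a matching entry."""
        memory[addr] = value
        self.entries.pop(addr, None)

    def check(self, memory, addr, reg, regs):
        """chk.a: if the entry was invalidated, recover by reloading."""
        if self.entries.get(addr) != reg:
            regs[reg] = memory[addr]

mem = {0x100: 7}
regs = {}
alat = ALAT()
alat.advanced_load(mem, 0x100, "r32", regs)   # speculatively promote *p into r32
alat.store(mem, 0x100, 9)                     # possibly aliasing store: *q = 9
alat.check(mem, 0x100, "r32", regs)           # recovery: r32 is reloaded
print(regs["r32"])                            # -> 9
```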
Optimization for the Intel® Itanium® architecture register stack
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191538
A. Settle, D. Connors, Gerolf Hoflehner, Daniel M. Lavery
Abstract: The Intel® Itanium® architecture contains a number of innovative compiler-controllable features designed to exploit instruction-level parallelism. New code generation and optimization techniques are critical to applying these features to improve processor performance. For instance, the Itanium® architecture provides a compiler-controllable virtual register stack to reduce the penalty of memory accesses associated with procedure calls. The Itanium® Register Stack Engine (RSE) transparently manages the register stack and saves and restores physical registers to and from memory as needed. Existing code generation techniques for the register stack aggressively allocate virtual registers without regard to the register pressure on different control-flow paths. As such, applications with large data sets may stress the RSE and cause substantial execution delays due to the high number of register saves and restores. Since the Itanium® architecture is developed around Explicitly Parallel Instruction Computing (EPIC) concepts, solutions for increasing the register stack efficiency favor code generation techniques rather than hardware approaches.
Citations: 7
Adaptive online context-sensitive inlining
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191550
K. Hazelwood, D. Grove
Abstract: As current trends in software development move toward more complex object-oriented programming, inlining has become a vital optimization that provides substantial performance improvements to C++ and Java programs. Yet the aggressiveness of the inlining algorithm must be carefully monitored to effectively balance performance and code size. The state of the art is to use profile information (associated with call edges) to guide inlining decisions. In the presence of virtual method calls, profile information for one call edge may not be sufficient for making effectual inlining decisions. Therefore, we explore the use of profiling data with additional levels of context sensitivity. In addition to exploring fixed levels of context sensitivity, we explore several adaptive schemes that attempt to find the ideal degree of context sensitivity for each call site. Our techniques are evaluated on the basis of runtime performance, code size, and dynamic compilation time. On average, we found that with minimal impact on performance (±1%), context sensitivity can enable 10% reductions in compiled code space and compile time. Performance on individual programs varied from -4.2% to 5.3%, while reductions in compile time and code space of up to 33.0% and 56.7%, respectively, were obtained.
Citations: 58
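Context-sensitive call profiles key edge counts by a suffix of the caller stack, and an adaptive scheme grows that suffix length per call site only when the extra context pays off. The sketch below illustrates one such policy under simple assumptions of my own (grow the depth when no single target dominates); all identifiers are invented and this is not the paper's algorithm or the Jikes RVM implementation.

```python
# Illustrative sketch: call-edge counts keyed by the last k callers, with a
# per-call-site context depth that adapts when the target distribution is mixed.
from collections import defaultdict

class ContextProfile:
    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.depth = defaultdict(lambda: 1)                   # chosen depth per call site
        self.counts = defaultdict(lambda: defaultdict(int))   # (site, context) -> target -> count

    def record(self, call_stack, site, target):
        k = self.depth[site]
        self.counts[(site, tuple(call_stack[-k:]))][target] += 1

    def dominant_target(self, call_stack, site, threshold=0.8):
        """Suggest an inline target only if one target dominates under the current context."""
        k = self.depth[site]
        dist = self.counts.get((site, tuple(call_stack[-k:])), {})
        total = sum(dist.values())
        if not total:
            return None
        target, count = max(dist.items(), key=lambda kv: kv[1])
        if count / total >= threshold:
            return target
        if self.depth[site] < self.max_depth:
            self.depth[site] += 1        # adaptive step: try more context next time
        return None

prof = ContextProfile()
for _ in range(9):
    prof.record(["main", "draw"], site="shape.area()", target="Circle.area")
prof.record(["main", "draw"], site="shape.area()", target="Square.area")
print(prof.dominant_target(["main", "draw"], "shape.area()"))   # -> "Circle.area"
```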
Integrated prepass scheduling for a Java just-in-time compiler on the IA-64
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191542
T. Inagaki, H. Komatsu, T. Nakatani
Abstract: We present a new integrated prepass scheduling (IPS) algorithm for a Java just-in-time (JIT) compiler which integrates register minimization into list scheduling. We use backtracking in the list scheduling when we have used up all the available registers. To reduce the overhead of backtracking, we incrementally maintain a set of candidate instructions for undoing scheduling. To maximize the ILP after undoing scheduling, we select an instruction chain with the smallest increase in the total execution time. We implemented our new algorithm in a production-level Java JIT compiler for the Intel Itanium processor. Experiments showed that, compared to the best known algorithm by Govindarajan et al., our IPS algorithm improved performance by up to 1.8% while reducing the compilation time for IPS by 58% on average.
Citations: 10
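The core idea, list scheduling that is aware of register pressure, is easy to show in miniature. The sketch below is a deliberately simplified, hypothetical scheduler: at the pressure limit it merely prefers ready instructions that close live ranges, whereas the paper's actual IPS algorithm backtracks and undoes a scheduled instruction chain; that backtracking step is omitted here.

```python
# Simplified, hypothetical register-pressure-aware list scheduling.

def list_schedule(instrs, deps, priority, last_use, num_regs):
    """instrs: instruction ids; deps[i]: ids that must precede i;
    priority[i]: larger is more urgent; last_use[i]: live ranges i closes."""
    scheduled, live = [], 0
    remaining = set(instrs)
    while remaining:
        ready = [i for i in remaining if deps[i] <= set(scheduled)]
        if live >= num_regs:
            # At the pressure limit, prefer instructions that free registers.
            relievers = [i for i in ready if last_use[i] > 0]
            ready = relievers or ready
        pick = max(ready, key=lambda i: priority[i])
        scheduled.append(pick)
        remaining.remove(pick)
        live += 1 - last_use[pick]      # each instruction defines one value here
    return scheduled

instrs = ["a", "b", "c", "d"]
deps = {"a": set(), "b": set(), "c": {"a"}, "d": {"a", "b", "c"}}
priority = {"a": 4, "b": 3, "c": 2, "d": 1}
last_use = {"a": 0, "b": 0, "c": 1, "d": 3}
print(list_schedule(instrs, deps, priority, last_use, num_regs=2))  # -> ['a', 'b', 'c', 'd']
```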
TEST: a Tracer for Extracting Speculative Threads
International Symposium on Code Generation and Optimization, 2003. CGO 2003. Pub Date: 2003-03-23. DOI: 10.1109/CGO.2003.1191554
M. Chen, K. Olukotun
Abstract: Thread-level speculation (TLS) allows sequential programs to be arbitrarily decomposed into threads that can be safely executed in parallel. A key challenge for TLS processors is choosing thread decompositions that speed up the program. Current techniques for identifying decompositions have practical limitations in real systems. Traditional parallelizing compilers do not work effectively on most integer programs, and software profiling slows down program execution too much for real-time analysis. The Tracer for Extracting Speculative Threads (TEST) is hardware support that analyzes sequential program execution to estimate the performance of possible thread decompositions. This hardware is used in a dynamic parallelization system that automatically transforms unmodified, sequential Java programs to run on TLS processors. In this system, the best thread decompositions found by TEST are dynamically recompiled to run speculatively. The paper describes the analysis performed by TEST and presents simulation results demonstrating its effectiveness on real programs. Estimates are also provided showing that the tracer requires minimal hardware additions to our speculative chip multiprocessor (less than 1% of the total transistor count) and causes only minor slowdowns to programs during analysis (3-25%).
Citations: 54
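Estimating a decomposition's value from a sequential trace amounts to predicting how much of the observed work would overlap under speculation and how much would be redone after dependence violations. The back-of-the-envelope model below is a hypothetical illustration of that style of estimate, not the analysis TEST actually performs in hardware.

```python
# Hypothetical trace-based estimate of TLS speedup for a loop decomposition:
# violated iterations are assumed to re-execute serially, the rest overlap.

def estimated_speedup(iter_times, num_cpus, violation_rate):
    sequential = sum(iter_times)
    wasted = violation_rate * sequential          # work redone after violations
    parallel = sequential / num_cpus + wasted
    return sequential / parallel

trace = [120, 95, 130, 110, 100, 125]   # cycles per iteration, from a sequential trace
for cpus in (2, 4):
    print(cpus, round(estimated_speedup(trace, cpus, violation_rate=0.1), 2))
# 2 -> 1.67, 4 -> 2.86
```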