Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques: Latest Papers

Two techniques for static array partitioning on message-passing parallel machines
Eric Hung-Yu Tseng, J. Gaudiot
{"title":"Two techniques for static array partitioning on message-passing parallel machines","authors":"Eric Hung-Yu Tseng, J. Gaudiot","doi":"10.1109/PACT.1997.644018","DOIUrl":"https://doi.org/10.1109/PACT.1997.644018","url":null,"abstract":"We present two techniques for partitioning arrays in parallel DoAll loops for message-passing parallel machines. (1) Communication-free array partitioning: a general solution of communication-free partitioning is derived for arrays in a DoAll loop. The derivation is based on the Smith normal form decomposition of the matrix which characterizes the array references in a DoAll loop. (2) One block-communication partitioning: when communication-free partitioning is not possible, we derive the partitioning equations which allocate all remote data to a unique processor. Thus, at most one block-communication is required for each processor to obtain the remote data it needs during computation.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123003930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A register pressure sensitive instruction scheduler for dynamic issue processors
Rad Silvera, Jian Wang, R. Govindarajan, G. Gao
{"title":"A register pressure sensitive instruction scheduler for dynamic issue processors","authors":"Rad Silvera, Jian Wang, R. Govindarajan, G. Gao","doi":"10.1109/PACT.1997.644005","DOIUrl":"https://doi.org/10.1109/PACT.1997.644005","url":null,"abstract":"Several modern superscalar processors contain an out-of-order (OOO) instruction issue mechanism, which resolves dependencies between instructions to expose greater instruction-level parallelism (ILP). How to extend a traditional instruction scheduler to take advantage of these hardware resources has presented both a challenge and an opportunity for compiler design. In this paper, we present a new approach for instruction scheduling, which reorders the instructions in a traditional instruction schedule to reduce its register pressure while maintaining the amount of ILP exploitable by the target OOO processor. This may prevent the introduction of spill code, thus producing a performance improvement. We have implemented our instruction scheduler under the MOST scheduling testbed. Our experiments show that the proposed approach reduces the register pressure by 12.81% in SPEC92 benchmark loops which do not require any spill code. For loops with a high register pressure, our approach reduced the amount of spill code required by an average of 32.08% and produced an average performance improvement of 8.79%.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121760896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Path profile guided partial dead code elimination using predication
Rajiv Gupta, David A. Berson, J. Fang
{"title":"Path profile guided partial dead code elimination using predication","authors":"Rajiv Gupta, David A. Berson, J. Fang","doi":"10.1109/PACT.1997.644007","DOIUrl":"https://doi.org/10.1109/PACT.1997.644007","url":null,"abstract":"Presents a path-profile-guided partial dead code elimination algorithm that uses predication to enable sinking for the removal of deadness along frequently executed paths at the expense of adding additional instructions along infrequently executed paths. Our approach to optimization is particularly suitable for VLIW architectures since it directs the efforts of the optimizer towards aggressively enabling generation of fast schedules along frequently executed paths by reducing their critical path lengths. The paper presents a cost-benefit data flow analysis that uses path profiling information to determine the profitability of using predication-enabled sinking. The cost of predication-enabled sinking of a statement past a merge point is determined by identifying paths along which an additional statement is introduced. The benefit of predication-enabled sinking is determined by identifying paths along which additional dead code elimination is achieved due to predication. The results of this analysis are incorporated in a code sinking framework in which predication-enabled sinking is allowed past merge points only if its benefit is determined to be greater than the cost. It is also demonstrated that a trade-off can be made between the compile-time cost and the precision of the cost-benefit analysis.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129938609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 72
Determining the idle time of a tiling: new results
F. Desprez, J. Dongarra, F. Rastello, Y. Robert
{"title":"Determining the idle time of a tiling: new results","authors":"F. Desprez, J. Dongarra, F. Rastello, Y. Robert","doi":"10.1109/PACT.1997.644026","DOIUrl":"https://doi.org/10.1109/PACT.1997.644026","url":null,"abstract":"In the framework of fully permutable loops, tiling has been studied extensively as a source-to-source program transformation. We build upon recent results by Hogsted, Carter, and Ferrante (1997), who aim at determining the cumulated idle time spent by all processors while executing the partitioned (tiled) computation domain. We propose new, much shorter proofs of all their results and extend these in several important directions. More precisely, we provide an accurate solution for all values of the rise parameter that relates the shape of the iteration space to that of the tiles, and for all possible distributions of the tiles to processors. In contrast, the authors in Hogsted, Carter, and Ferrante (1997) deal only with a limited number of cases and provide upper bounds rather than exact formulas.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130397600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
Interprocedural distribution assignment placement: more than just enhancing intraprocedural placing techniques
J. Knoop, E. Mehofer
{"title":"Interprocedural distribution assignment placement: more than just enhancing intraprocedural placing techniques","authors":"J. Knoop, E. Mehofer","doi":"10.1109/PACT.1997.644001","DOIUrl":"https://doi.org/10.1109/PACT.1997.644001","url":null,"abstract":"Avoiding unnecessary remappings at run-time by means of a strategic distribution assignment placement (DAP) is a major means for improving the run-time efficiency of data-parallel programs on distributed-memory architectures. In Proc. Euro-Par '97, pp. 364-73 (1997), we presented a novel and aggressive intraprocedural algorithm achieving this by eliminating partially redundant and partially dead distribution assignments. In this paper, we show how to enhance this approach interprocedurally. Surprisingly at first sight, it turns out that a straightforward adaptation of the intraprocedural approach fails because central properties that are valid for the intraprocedural case do not carry over to the interprocedural one, revealing severe anomalies. After discussing the essential differences and analogies of DAP in the intraprocedural and interprocedural cases, we show how to overcome these anomalies in order to arrive at a powerful and flexible approach for interprocedural DAP (IDAP). As in the intraprocedural case, we get a hierarchy of IDAP algorithms of varying power and efficiency supporting user-customized solutions. First practical experiences underline its importance and effectiveness.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123877594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
The PROMIS compiler prototype
Carrie J. Brownhill, A. Nicolau, S. Novack, C. Polychronopoulos
{"title":"The PROMIS compiler prototype","authors":"Carrie J. Brownhill, A. Nicolau, S. Novack, C. Polychronopoulos","doi":"10.1109/PACT.1997.644008","DOIUrl":"https://doi.org/10.1109/PACT.1997.644008","url":null,"abstract":"Source-code parallelizers and instruction-level parallelizers each have specific advantages. Usually, a compiler is designed to be one or the other, based on the target architecture and/or algorithms. A compiler that is designed to generate near-optimal code for modern, multi-level machines must have the capabilities of both. This paper describes the prototype of the PROMIS compiler. The prototype was designed to show that loop-level and instruction-level parallelization can be combined to produce results better than either one alone. In addition, it shows how communication between the levels can produce additional speedup.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129010503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Static locality analysis for cache management
F. Sánchez, Antonio González, M. Valero
{"title":"Static locality analysis for cache management","authors":"F. Sánchez, Antonio González, M. Valero","doi":"10.1109/PACT.1997.644022","DOIUrl":"https://doi.org/10.1109/PACT.1997.644022","url":null,"abstract":"Most memory references in numerical codes correspond to array references whose indices are affine functions of surrounding loop indices. These array references follow a regular predictable memory pattern that can be analysed at compile time. This analysis can provide valuable information like the locality exhibited by the program, which can be used to implement a more intelligent caching strategy. In this paper we propose a static locality analysis oriented to the management of data caches. We show that previous proposals on locality analysis are not appropriate when the programs have a high conflict miss ratio. This paper examines those proposals by introducing a compile-time interference analysis that significantly improves their performance. We first show how this analysis can be used to characterize the dynamic locality properties of numerical codes. This evaluation shows for instance that a large percentage of references exhibit any type of locality. This motivates the use of a dual data cache, which has a module specialized to exploit temporal locality, and a selective cache respectively. Then, the performance provided by these two cache organizations is evaluated. In both organizations, the static locality analysis is responsible for tagging each memory instruction according to the particular type(s) of locality that it exhibits.","PeriodicalId":177411,"journal":{"name":"Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127333874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35