Multigrain automatic parallelization in Japanese Millennium Project IT21. Advanced Parallelizing Compiler

Proceedings. International Conference on Parallel Computing in Electrical Engineering. Pub Date: 2002-09-22. DOI: 10.1109/PCEE.2002.1115213

H. Kasahara, M. Obata, K. Ishizaka, K. Kimura, H. Kaminaga, H. Nakano, Kouhei Nagasawa, Akiko Murai, H. Itagaki, J. Shirako


Citations: 8

Abstract

This paper describes the OSCAR multigrain parallelizing compiler, developed in the Japanese Millennium Project IT21 "Advanced Parallelizing Compiler" project, and its performance on SMP machines. The compiler realizes multigrain parallelization for targets ranging from chip multiprocessors to high-end servers. In addition to loop parallelism, it hierarchically exploits coarse-grain task parallelism among loops, subroutines, and basic blocks, and near-fine-grain parallelism among statements inside a basic block. It also globally optimizes cache use across different loops, or coarse-grain tasks, using the data localization technique to reduce memory access overhead. The current performance of the OSCAR compiler on SPEC95fp is evaluated on different SMPs. For example, on an 8-processor IBM RS6000 it achieves speedups of 3.7 for HYDRO2D, 1.8 for SWIM, 1.7 for SU2COR, 2.0 for MGRID, and 3.3 for TURB3D against the XL Fortran compiler ver. 7.1; on a 4-processor Sun Ultra80 workstation it achieves speedups of 4.2 for SWIM and 2.2 for TURB3D against Forte6 update 2.
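To make the idea of coarse-grain task parallelism concrete, the following is a minimal sketch and not the OSCAR compiler's actual algorithm. It assumes a program has already been split into coarse-grain tasks (loops, subroutines, basic blocks) with known dependences and rough costs. The task names, the `deps` and `cost` tables, and both helper functions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def schedule_levels(deps):
    """Group tasks into levels; tasks in one level do not depend on each other."""
    level = {}

    def depth(task):
        # A task's level is one more than the deepest task it depends on.
        if task not in level:
            level[task] = 1 + max((depth(d) for d in deps[task]), default=-1)
        return level[task]

    for task in deps:
        depth(task)
    groups = defaultdict(list)
    for task, lvl in level.items():
        groups[lvl].append(task)
    return [sorted(groups[lvl]) for lvl in sorted(groups)]


def critical_path(deps, cost):
    """Length of the longest dependence chain: the best time with unlimited processors."""
    finish = {}

    def finish_time(task):
        if task not in finish:
            start = max((finish_time(d) for d in deps[task]), default=0)
            finish[task] = start + cost[task]
        return finish[task]

    return max(finish_time(task) for task in deps)


# Hypothetical coarse-grain tasks: each entry lists the tasks it must wait for.
deps = {
    "loop_A": [],
    "loop_B": [],
    "sub_C": ["loop_A"],
    "block_D": ["loop_A", "loop_B"],
    "loop_E": ["sub_C", "block_D"],
}
# Hypothetical execution costs in arbitrary time units.
cost = {"loop_A": 4, "loop_B": 3, "sub_C": 2, "block_D": 5, "loop_E": 1}

levels = schedule_levels(deps)
sequential_time = sum(cost.values())
parallel_time = critical_path(deps, cost)
speedup_bound = sequential_time / parallel_time

print(levels)
print(sequential_time, parallel_time, speedup_bound)
```

In this toy graph, `loop_A` and `loop_B` can run concurrently, as can `sub_C` and `block_D`. The ratio of total work to critical-path length gives only an upper bound on the speedup from task parallelism alone; a real compiler must also account for processor count, scheduling overhead, and memory behavior, which is where techniques such as the data localization mentioned in the abstract come in.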

