H. Kasahara, M. Obata, K. Ishizaka, K. Kimura, H. Kaminaga, H. Nakano, Kouhei Nagasawa, Akiko Murai, H. Itagaki, J. Shirako
{"title":"日本千禧年计划IT21的多粒自动并行化。高级并行编译器","authors":"H. Kasahara, M. Obata, K. Ishizaka, K. Kimura, H. Kaminaga, H. Nakano, Kouhei Nagasawa, Akiko Murai, H. Itagaki, J. Shirako","doi":"10.1109/PCEE.2002.1115213","DOIUrl":null,"url":null,"abstract":"This paper describes OSCAR multigrain parallelizing compiler which has been developed in Japanese Millennium Project IT21 \"Advanced Parallelizing Compiler\" project and its performance on SMP machines. The compiler realizes multigrain parallelization for chip-multiprocessors to high-end servers. It hierarchically exploits coarse grain task parallelism among loops, subroutines and basic blocks and near fine grain parallelism among statements inside a basic block in addition to loop parallelism. Also, it globally optimizes cache use over different loops, or coarse grain tasks, based on the data localization technique to reduce memory access overhead. Current performance of OSCAR compiler for SPEC95fp is evaluated on different SMPs. For example, it gives us 3.7 times speedup for HYDRO2D, 1.8 times for SWIM, 1.7 times for SU2COR, 2.0 times for MGRID, 3.3 times for TURB3D on 8 processor IBM RS6000, against XL Fortran compiler ver 7.1 and 4.2 times speedup for SWIM and 2.2 times speedup for TURB3D on 4 processor Sun Ultra80 workstation against Forte6 update 2.","PeriodicalId":444003,"journal":{"name":"Proceedings. International Conference on Parallel Computing in Electrical Engineering","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Multigrain automatic parallelization in Japanese Millennium Project IT21. Advanced Parallelizing Compiler\",\"authors\":\"H. Kasahara, M. Obata, K. Ishizaka, K. Kimura, H. Kaminaga, H. Nakano, Kouhei Nagasawa, Akiko Murai, H. Itagaki, J. Shirako\",\"doi\":\"10.1109/PCEE.2002.1115213\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes OSCAR multigrain parallelizing compiler which has been developed in Japanese Millennium Project IT21 \\\"Advanced Parallelizing Compiler\\\" project and its performance on SMP machines. The compiler realizes multigrain parallelization for chip-multiprocessors to high-end servers. It hierarchically exploits coarse grain task parallelism among loops, subroutines and basic blocks and near fine grain parallelism among statements inside a basic block in addition to loop parallelism. Also, it globally optimizes cache use over different loops, or coarse grain tasks, based on the data localization technique to reduce memory access overhead. Current performance of OSCAR compiler for SPEC95fp is evaluated on different SMPs. For example, it gives us 3.7 times speedup for HYDRO2D, 1.8 times for SWIM, 1.7 times for SU2COR, 2.0 times for MGRID, 3.3 times for TURB3D on 8 processor IBM RS6000, against XL Fortran compiler ver 7.1 and 4.2 times speedup for SWIM and 2.2 times speedup for TURB3D on 4 processor Sun Ultra80 workstation against Forte6 update 2.\",\"PeriodicalId\":444003,\"journal\":{\"name\":\"Proceedings. International Conference on Parallel Computing in Electrical Engineering\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2002-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. International Conference on Parallel Computing in Electrical Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PCEE.2002.1115213\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. International Conference on Parallel Computing in Electrical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PCEE.2002.1115213","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multigrain automatic parallelization in Japanese Millennium Project IT21. Advanced Parallelizing Compiler
This paper describes OSCAR multigrain parallelizing compiler which has been developed in Japanese Millennium Project IT21 "Advanced Parallelizing Compiler" project and its performance on SMP machines. The compiler realizes multigrain parallelization for chip-multiprocessors to high-end servers. It hierarchically exploits coarse grain task parallelism among loops, subroutines and basic blocks and near fine grain parallelism among statements inside a basic block in addition to loop parallelism. Also, it globally optimizes cache use over different loops, or coarse grain tasks, based on the data localization technique to reduce memory access overhead. Current performance of OSCAR compiler for SPEC95fp is evaluated on different SMPs. For example, it gives us 3.7 times speedup for HYDRO2D, 1.8 times for SWIM, 1.7 times for SU2COR, 2.0 times for MGRID, 3.3 times for TURB3D on 8 processor IBM RS6000, against XL Fortran compiler ver 7.1 and 4.2 times speedup for SWIM and 2.2 times speedup for TURB3D on 4 processor Sun Ultra80 workstation against Forte6 update 2.