D. Hisley, G. Agrawal, P. Satya-narayana, L. Pollock
{"title":"使用OpenMP进行不规则代码的移植和性能评估","authors":"D. Hisley, G. Agrawal, P. Satya-narayana, L. Pollock","doi":"10.1002/1096-9128(200010)12:12%3C1241::AID-CPE523%3E3.0.CO;2-D","DOIUrl":null,"url":null,"abstract":"In the last two years, OpenMP has been gaining popularity as a standard for developing portable shared memory parallel programs. With the improvements in centralized shared memory technologies and the emergence of distributed shared memory (DSM) architectures, several medium-to-large physical and logical shared memory configurations are now available. Thus, OpenMP stands to be a promising medium for developing scalable and portable parallel programs. \n \nIn this paper, we focus on evaluating the suitability of OpenMP for developing scalable and portable irregular applications. We examine the programming paradigms supported by OpenMP that are suitable for this important class of applications, the performance and scalability achieved with these applications, the achieved locality and uniprocessor cache performance and the factors behind imperfect scalability. We have used two irregular applications and one NAS irregular code as the benchmarks for our study. Our experiments have been conducted on a 64-processor SGI Origin 2000. \n \nOur experiments show that reasonably good scalability is possible using OpenMP if careful attention is paid to locality and load balancing issues. Particularly, using the Single Program Multiple Data (SPMD) paradigm for programming is a significant win over just using loop parallelization directives. As expected, the cost of remote accesses is the major factor behind imperfect speedups of SPMD OpenMP programs. Copyright © 2000 John Wiley & Sons, Ltd.","PeriodicalId":199059,"journal":{"name":"Concurr. Pract. 
Exp.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Porting and performance evaluation of irregular codes using OpenMP\",\"authors\":\"D. Hisley, G. Agrawal, P. Satya-narayana, L. Pollock\",\"doi\":\"10.1002/1096-9128(200010)12:12%3C1241::AID-CPE523%3E3.0.CO;2-D\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the last two years, OpenMP has been gaining popularity as a standard for developing portable shared memory parallel programs. With the improvements in centralized shared memory technologies and the emergence of distributed shared memory (DSM) architectures, several medium-to-large physical and logical shared memory configurations are now available. Thus, OpenMP stands to be a promising medium for developing scalable and portable parallel programs. \\n \\nIn this paper, we focus on evaluating the suitability of OpenMP for developing scalable and portable irregular applications. We examine the programming paradigms supported by OpenMP that are suitable for this important class of applications, the performance and scalability achieved with these applications, the achieved locality and uniprocessor cache performance and the factors behind imperfect scalability. We have used two irregular applications and one NAS irregular code as the benchmarks for our study. Our experiments have been conducted on a 64-processor SGI Origin 2000. \\n \\nOur experiments show that reasonably good scalability is possible using OpenMP if careful attention is paid to locality and load balancing issues. Particularly, using the Single Program Multiple Data (SPMD) paradigm for programming is a significant win over just using loop parallelization directives. As expected, the cost of remote accesses is the major factor behind imperfect speedups of SPMD OpenMP programs. 
Copyright © 2000 John Wiley & Sons, Ltd.\",\"PeriodicalId\":199059,\"journal\":{\"name\":\"Concurr. Pract. Exp.\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Concurr. Pract. Exp.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/1096-9128(200010)12:12%3C1241::AID-CPE523%3E3.0.CO;2-D\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Concurr. Pract. Exp.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/1096-9128(200010)12:12%3C1241::AID-CPE523%3E3.0.CO;2-D","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 17
Porting and performance evaluation of irregular codes using OpenMP
In the last two years, OpenMP has gained popularity as a standard for developing portable shared-memory parallel programs. With improvements in centralized shared memory technologies and the emergence of distributed shared memory (DSM) architectures, several medium-to-large physical and logical shared memory configurations are now available. Thus, OpenMP stands out as a promising medium for developing scalable and portable parallel programs.
In this paper, we focus on evaluating the suitability of OpenMP for developing scalable and portable irregular applications. We examine the programming paradigms supported by OpenMP that suit this important class of applications, the performance and scalability achieved by these applications, the locality and uniprocessor cache performance attained, and the factors behind imperfect scalability. We used two irregular applications and one NAS irregular code as the benchmarks for our study. Our experiments were conducted on a 64-processor SGI Origin 2000.
Our experiments show that reasonably good scalability is possible with OpenMP if careful attention is paid to locality and load balancing. In particular, programming in the Single Program Multiple Data (SPMD) paradigm is a significant win over using loop parallelization directives alone. As expected, the cost of remote accesses is the major factor behind the imperfect speedups of SPMD OpenMP programs. Copyright © 2000 John Wiley & Sons, Ltd.