An Approach for Indirectly Adopting a Performance Portability Layer in Large Legacy Codes

2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC). Pub Date: 2019-11-01. DOI: 10.1109/P3HPC49587.2019.00009

John K. Holmen, B. Peterson, M. Berzins

Citations: 12

Abstract

An Approach for Indirectly Adopting a Performance Portability Layer in Large Legacy Codes

Diversity among supported architectures in current and emerging high performance computing systems, including those for exascale, makes portable codebases desirable. Portability of a codebase can be improved using a performance portability layer to provide access to multiple underlying programming models through a single interface. Direct adoption of a performance portability layer, however, poses challenges for large pre-existing software frameworks that may need to preserve legacy code and/or adopt other programming models in the future. This paper describes an approach for indirect adoption that introduces a framework-specific portability layer between the application developer and the adopted performance portability layer to help improve legacy code support and long-term portability for future architectures and programming models. This intermediate layer uses loop-level, application-level, and build-level components to ease adoption of a performance portability layer in large legacy codebases. Results are shown for two challenging case studies using this approach to make portable use of OpenMP and CUDA via Kokkos in an asynchronous many-task runtime system, Uintah. These results show performance improvements of up to 2.7x when refactoring for portability and 2.6x when using a node more efficiently. Good strong scaling to 442,368 threads across 1,728 Knights Landing processors is also shown using MPI+Kokkos at scale.
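The idea of an intermediate, framework-specific layer can be illustrated with a minimal C++ sketch. It is not taken from the paper and does not reproduce Uintah's or Kokkos's interfaces: the namespace `fw`, the tags `Serial` and `OpenMP`, the macro `FW_USE_OPENMP`, and the function `scale` are all hypothetical names chosen for illustration. The sketch assumes that application code calls only the framework's own `parallel_for`, so that back ends can be added or swapped behind it without touching application loops.

```cpp
#include <cassert>
#include <vector>

// Hypothetical framework-specific portability layer (illustrative names only;
// this is not the API of Uintah or Kokkos).
namespace fw {

using index_t = long;  // signed index so the loop is valid for any OpenMP version

// Back-end tags. A Kokkos-backed tag could be added alongside these, with an
// overload that forwards to Kokkos::parallel_for.
struct Serial {};
struct OpenMP {};

// Loop-level component: the only loop interface application code sees.
template <class Functor>
void parallel_for(Serial, index_t begin, index_t end, const Functor& f) {
    assert(begin <= end);
    for (index_t i = begin; i < end; ++i) f(i);
}

template <class Functor>
void parallel_for(OpenMP, index_t begin, index_t end, const Functor& f) {
    assert(begin <= end);
    // Runs in parallel when OpenMP is enabled at compile time (e.g. -fopenmp);
    // otherwise the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (index_t i = begin; i < end; ++i) f(i);
}

// Build-level component: one compile-time switch picks the default back end.
#ifdef FW_USE_OPENMP
using DefaultBackend = OpenMP;
#else
using DefaultBackend = Serial;
#endif

}  // namespace fw

// Application-level code: written once against fw::, unaware of the back end.
template <class Backend>
std::vector<double> scale(Backend backend, const std::vector<double>& in, double a) {
    std::vector<double> out(in.size());
    const fw::index_t n = static_cast<fw::index_t>(in.size());
    // Each iteration writes a distinct element, so iterations are independent.
    fw::parallel_for(backend, 0, n, [&](fw::index_t i) { out[i] = a * in[i]; });
    return out;
}
```

Calling `scale(fw::DefaultBackend{}, data, 2.0)` leaves the choice of back end to the build configuration, while legacy loops that have not yet been ported can keep using `fw::Serial{}` explicitly.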
