Hyun-Gyu Kang, Raymond S Tuminaro, Andrey Prokopenko, Seth R Johnson, A. Salinger, Katherine J Evans
The International Journal of High Performance Computing Applications, published 2023-11-18. DOI: https://doi.org/10.1177/10943420231205601
An implicit barotropic mode solver for MPAS-ocean using a modern Fortran solver interface
We demonstrate the use of a modern Fortran solver interface to manage solver algorithms for an implicit barotropic mode solver in the Model for Prediction Across Scales-Ocean (MPAS-O). ForTrilinos, a Fortran interface to Trilinos (a large collection of solver capabilities written in C++), has been implemented in MPAS-O to provide access to a suite of linear solver options. Because the Simplified Wrapper and Interface Generator (SWIG) tool automatically generates modern Fortran interfaces to C++ code, we were able to implement the Fortran solver interface in MPAS-O using a familiar Fortran coding style while minimizing performance degradation. The ForTrilinos solver interface is written as a subroutine within MPAS-O’s time-stepping modules, alongside the existing MPAS-O code. We examine the parallel performance of the ForTrilinos solvers on an idealized ocean test case and a high-resolution realistic ocean test case. The parallel scalability of the ForTrilinos solvers depends strongly on the number of global synchronization points per iteration of each iterative solver algorithm. The ForTrilinos solvers compare most favorably with the Fortran hand-crafted (FHC) solver when the amount of work per processor is large enough. The FHC solver scales better in parallel, however, so it outperforms ForTrilinos when the work per core is modest. The intercomparison of the two solvers shows that this performance penalty in the ForTrilinos solver comes mostly from global synchronization, and it suggests that matrix-vector multiplication in the FHC solver needs to be optimized for better performance.
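To make the notion of "global synchronization points per solver iteration" concrete, the following is a minimal serial sketch of a textbook conjugate gradient iteration that counts its dot products. It is not the MPAS-O solver and does not use the ForTrilinos or Trilinos API; the names `ReductionCounter`, `conjugate_gradient`, and `laplacian_1d`, and the small 1D test matrix, are hypothetical choices made only for illustration. The assumption is that, in a distributed-memory run, each dot product would require a global all-reduce, whereas the matrix-vector product would need only neighbor communication.

```python
import numpy as np


class ReductionCounter:
    """Counts dot products; each one is assumed to map to a global all-reduce under MPI."""

    def __init__(self):
        self.count = 0

    def dot(self, x, y):
        self.count += 1
        return float(np.dot(x, y))


def conjugate_gradient(A, b, tol=1e-10, max_iter=500):
    """Textbook CG for a symmetric positive definite A.

    Returns the solution, the iteration count, and the number of
    dot products performed inside the iteration loop.
    """
    red = ReductionCounter()
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rr = red.dot(r, r)            # setup reduction, done once before the loop
    setup = red.count
    iters = 0
    while iters < max_iter and rr > tol ** 2:
        Ap = A @ p                # matvec: neighbor exchange only in a parallel run
        alpha = rr / red.dot(p, Ap)   # synchronization point 1
        x += alpha * p
        r -= alpha * Ap
        rr_new = red.dot(r, r)        # synchronization point 2
        p = r + (rr_new / rr) * p
        rr = rr_new
        iters += 1
    return x, iters, red.count - setup


def laplacian_1d(n):
    """Dense 1D Laplacian (tridiagonal 2, -1), used here only as a small SPD test matrix."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


A = laplacian_1d(50)
b = np.ones(50)
x, iters, reductions = conjugate_gradient(A, b)
print(f"iterations: {iters}, in-loop reductions: {reductions}")
```

In this sketch every iteration performs two reductions, so the synchronization cost grows with the iteration count regardless of how little local work each process has. Solver variants that fuse or reduce these reductions trade extra local arithmetic for fewer global synchronizations, which is the kind of trade-off the scalability comparison in the abstract refers to.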