{"title":"Investigating Performance and Potential of the Parallel STL Using NAS Parallel Benchmark Kernels","authors":"Nicco Mietzsch, Karl Fuerlinger","doi":"10.1109/HPCS48598.2019.9188147","DOIUrl":null,"url":null,"abstract":"In recent years, multicore shared memory architectures have become more and more powerful. To effectively use such machines, many frameworks are available, including OpenMP and Intel threading building blocks (TBB). Since the 2017 version of its standard, C++ provides parallel algorithmic building blocks in the form of the Parallel Standard Template Library (pSTL). Unfortunately, compiler and runtime support for these new features improves slowly and few studies on the performance and potential of the pSTL are available.Our goal in this work is to evaluate the applicability of the Parallel STL in the context of scientific and technical parallel computing. To this end, we assess the performance of the pSTL using the NAS Parallel Benchmarks (NPB). Our study shows that, while there are algorithms which are difficult to implement using the pSTL, most kernels can easily be transformed into a pSTL version, with their performance approximately on par with other parallelization approaches.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on High Performance Computing & Simulation (HPCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCS48598.2019.9188147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
In recent years, multicore shared-memory architectures have become more and more powerful. To use such machines effectively, many frameworks are available, including OpenMP and Intel Threading Building Blocks (TBB). Since the 2017 version of its standard (C++17), C++ provides parallel algorithmic building blocks in the form of the Parallel Standard Template Library (pSTL). Unfortunately, compiler and runtime support for these new features is improving only slowly, and few studies on the performance and potential of the pSTL are available. Our goal in this work is to evaluate the applicability of the Parallel STL in the context of scientific and technical parallel computing. To this end, we assess the performance of the pSTL using the NAS Parallel Benchmarks (NPB). Our study shows that, while some algorithms are difficult to implement using the pSTL, most kernels can easily be transformed into a pSTL version, with performance approximately on par with other parallelization approaches.
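To illustrate the interface the abstract refers to, the following is a minimal sketch (not taken from the paper's NPB kernels) of C++17 parallel algorithms: a standard algorithm from <algorithm>/<numeric> combined with an execution policy from <execution>, here std::execution::par. The array sizes and the lambda are arbitrary example values.

// Minimal C++17 pSTL sketch: parallel element-wise update and a
// parallel dot product. Requires a compiler/runtime with Parallel
// STL support (e.g. a TBB-backed implementation).
#include <algorithm>
#include <execution>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> x(1 << 20, 1.5);
    std::vector<double> y(1 << 20, 2.0);

    // Parallel element-wise update: y[i] = y[i] + 2.0 * x[i]
    std::transform(std::execution::par, x.begin(), x.end(), y.begin(),
                   y.begin(),
                   [](double xi, double yi) { return yi + 2.0 * xi; });

    // Parallel dot product of x and y via transform_reduce
    double dot = std::transform_reduce(std::execution::par,
                                       x.begin(), x.end(), y.begin(), 0.0);

    std::cout << "dot = " << dot << "\n";
    return 0;
}

The same call without the execution-policy argument runs sequentially, which is what makes porting existing STL-based kernels to the pSTL comparatively straightforward, as the abstract notes.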