{"title":"使用NAS并行基准核研究并行STL的性能和潜力","authors":"Nicco Mietzsch, Karl Fuerlinger","doi":"10.1109/HPCS48598.2019.9188147","DOIUrl":null,"url":null,"abstract":"In recent years, multicore shared memory architectures have become more and more powerful. To effectively use such machines, many frameworks are available, including OpenMP and Intel threading building blocks (TBB). Since the 2017 version of its standard, C++ provides parallel algorithmic building blocks in the form of the Parallel Standard Template Library (pSTL). Unfortunately, compiler and runtime support for these new features improves slowly and few studies on the performance and potential of the pSTL are available.Our goal in this work is to evaluate the applicability of the Parallel STL in the context of scientific and technical parallel computing. To this end, we assess the performance of the pSTL using the NAS Parallel Benchmarks (NPB). Our study shows that, while there are algorithms which are difficult to implement using the pSTL, most kernels can easily be transformed into a pSTL version, with their performance approximately on par with other parallelization approaches.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Investigating Performance and Potential of the Parallel STL Using NAS Parallel Benchmark Kernels\",\"authors\":\"Nicco Mietzsch, Karl Fuerlinger\",\"doi\":\"10.1109/HPCS48598.2019.9188147\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, multicore shared memory architectures have become more and more powerful. To effectively use such machines, many frameworks are available, including OpenMP and Intel threading building blocks (TBB). Since the 2017 version of its standard, C++ provides parallel algorithmic building blocks in the form of the Parallel Standard Template Library (pSTL). Unfortunately, compiler and runtime support for these new features improves slowly and few studies on the performance and potential of the pSTL are available.Our goal in this work is to evaluate the applicability of the Parallel STL in the context of scientific and technical parallel computing. To this end, we assess the performance of the pSTL using the NAS Parallel Benchmarks (NPB). 
Our study shows that, while there are algorithms which are difficult to implement using the pSTL, most kernels can easily be transformed into a pSTL version, with their performance approximately on par with other parallelization approaches.\",\"PeriodicalId\":371856,\"journal\":{\"name\":\"2019 International Conference on High Performance Computing & Simulation (HPCS)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on High Performance Computing & Simulation (HPCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCS48598.2019.9188147\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on High Performance Computing & Simulation (HPCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCS48598.2019.9188147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Investigating Performance and Potential of the Parallel STL Using NAS Parallel Benchmark Kernels
In recent years, multicore shared-memory architectures have become increasingly powerful. Many frameworks are available to use such machines effectively, including OpenMP and Intel Threading Building Blocks (TBB). Since the 2017 version of its standard, C++ provides parallel algorithmic building blocks in the form of the Parallel Standard Template Library (pSTL). Unfortunately, compiler and runtime support for these new features has improved only slowly, and few studies on the performance and potential of the pSTL are available. Our goal in this work is to evaluate the applicability of the Parallel STL in the context of scientific and technical parallel computing. To this end, we assess the performance of the pSTL using the NAS Parallel Benchmarks (NPB). Our study shows that, while some algorithms are difficult to implement using the pSTL, most kernels can easily be transformed into a pSTL version, with performance approximately on par with other parallelization approaches.
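For context, the pSTL referred to in the abstract is the set of standard C++ algorithms that accept an execution policy, introduced with C++17. The following minimal sketch (for illustration only; it is not taken from the paper and the compiler flags are assumptions, e.g. a libstdc++ build backed by Intel TBB) shows the style of parallel code the abstract describes:

// Illustrative sketch of C++17 parallel algorithms (not from the paper).
// Compile with, e.g.: g++ -std=c++17 -O2 pstl_example.cpp -ltbb
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> a(1 << 20, 1.0), b(1 << 20, 2.0), c(1 << 20);

    // Element-wise operation, parallelized via the std::execution::par policy.
    std::transform(std::execution::par, a.begin(), a.end(), b.begin(), c.begin(),
                   [](double x, double y) { return x + 2.0 * y; });

    // Parallel reduction over the result.
    double sum = std::reduce(std::execution::par, c.begin(), c.end(), 0.0);

    return sum > 0.0 ? 0 : 1;
}

Switching the first argument from std::execution::par to std::execution::seq (or omitting it) yields the sequential version, which is what makes porting loop-based kernels to the pSTL comparatively straightforward.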