{"title":"Scalasca支持基于Intel Xeon Phi的大规模HPC系统上的MPI+OpenMP并行应用","authors":"B. Wylie, W. Frings","doi":"10.1145/2484762.2484777","DOIUrl":null,"url":null,"abstract":"Intel Xeon Phi coprocessors based on the Many Integrated Core (MIC) architecture are starting to appear in HPC systems, with Stampede being a prominent example available within the XSEDE cyber-infrastructure. Porting MPI and OpenMP applications to such systems is often no more than simple recompilation, however, execution performance needs to be carefully analyzed and tuned to effectively exploit their unique capabilities. For performance measurement and analysis tools, the variety of execution modes need to be supported in a consistent and convenient manner, and especially execution configurations involving large numbers of compute nodes each with several multicore host processors and many-core coprocessors. Early experience using the open-source Scalasca toolset for runtime summarization and automatic trace analysis with the NPB BT-MZ MPI+OpenMP parallel application on Stampede is reported, along with discussion of on-going and future work.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"31 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Scalasca support for MPI+OpenMP parallel applications on large-scale HPC systems based on Intel Xeon Phi\",\"authors\":\"B. Wylie, W. Frings\",\"doi\":\"10.1145/2484762.2484777\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Intel Xeon Phi coprocessors based on the Many Integrated Core (MIC) architecture are starting to appear in HPC systems, with Stampede being a prominent example available within the XSEDE cyber-infrastructure. Porting MPI and OpenMP applications to such systems is often no more than simple recompilation, however, execution performance needs to be carefully analyzed and tuned to effectively exploit their unique capabilities. For performance measurement and analysis tools, the variety of execution modes need to be supported in a consistent and convenient manner, and especially execution configurations involving large numbers of compute nodes each with several multicore host processors and many-core coprocessors. Early experience using the open-source Scalasca toolset for runtime summarization and automatic trace analysis with the NPB BT-MZ MPI+OpenMP parallel application on Stampede is reported, along with discussion of on-going and future work.\",\"PeriodicalId\":426819,\"journal\":{\"name\":\"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery\",\"volume\":\"31 2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-07-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2484762.2484777\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2484762.2484777","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Scalasca support for MPI+OpenMP parallel applications on large-scale HPC systems based on Intel Xeon Phi
Intel Xeon Phi coprocessors based on the Many Integrated Core (MIC) architecture are starting to appear in HPC systems, with Stampede being a prominent example available within the XSEDE cyber-infrastructure. Porting MPI and OpenMP applications to such systems is often no more than simple recompilation, however, execution performance needs to be carefully analyzed and tuned to effectively exploit their unique capabilities. For performance measurement and analysis tools, the variety of execution modes need to be supported in a consistent and convenient manner, and especially execution configurations involving large numbers of compute nodes each with several multicore host processors and many-core coprocessors. Early experience using the open-source Scalasca toolset for runtime summarization and automatic trace analysis with the NPB BT-MZ MPI+OpenMP parallel application on Stampede is reported, along with discussion of on-going and future work.