{"title":"节点共享策略在HPC批处理系统中的作用和收益","authors":"Alvaro Frank, Tim Süß, A. Brinkmann","doi":"10.1109/IPDPS.2019.00016","DOIUrl":null,"url":null,"abstract":"Processor manufacturers today scale performance by increasing the number of cores on each CPU. Unfortunately, not all HPC applications can efficiently saturate all cores of a single node, even if they successfully scale to thousands of nodes. For these applications, sharing nodes with other applications can help to stress different resources on the nodes to more efficiently use them. Previous work has shown that the performance impact of node sharing is very application dependent but very little work has studied its effects within batch systems and for complex parallel application mixes. Administrators therefore typically fear the complexity of running a batch system supporting node sharing and also fear that interference between co-allocated jobs in practice leads to worse performance. This paper focuses on sharing nodes by oversubscribing cores through hyper-threading. We introduce new node sharing strategies for batch systems by deriving extensions to the well-known backfill and first fit algorithms. These strategies have been implemented in the SLURM workload manager and the evaluation is based on NERSC Trinity scientific mini applications. The evaluation of our node sharing strategies shows no overhead when using co-allocation, but an increased computational efficiency of 19% and an increased scheduling efficiency of 25.2% compared to standard node allocation.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Effects and Benefits of Node Sharing Strategies in HPC Batch Systems\",\"authors\":\"Alvaro Frank, Tim Süß, A. Brinkmann\",\"doi\":\"10.1109/IPDPS.2019.00016\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Processor manufacturers today scale performance by increasing the number of cores on each CPU. Unfortunately, not all HPC applications can efficiently saturate all cores of a single node, even if they successfully scale to thousands of nodes. For these applications, sharing nodes with other applications can help to stress different resources on the nodes to more efficiently use them. Previous work has shown that the performance impact of node sharing is very application dependent but very little work has studied its effects within batch systems and for complex parallel application mixes. Administrators therefore typically fear the complexity of running a batch system supporting node sharing and also fear that interference between co-allocated jobs in practice leads to worse performance. This paper focuses on sharing nodes by oversubscribing cores through hyper-threading. We introduce new node sharing strategies for batch systems by deriving extensions to the well-known backfill and first fit algorithms. These strategies have been implemented in the SLURM workload manager and the evaluation is based on NERSC Trinity scientific mini applications. 
The evaluation of our node sharing strategies shows no overhead when using co-allocation, but an increased computational efficiency of 19% and an increased scheduling efficiency of 25.2% compared to standard node allocation.\",\"PeriodicalId\":403406,\"journal\":{\"name\":\"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2019.00016\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2019.00016","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Effects and Benefits of Node Sharing Strategies in HPC Batch Systems
Processor manufacturers today scale performance by increasing the number of cores per CPU. Unfortunately, not all HPC applications can efficiently saturate all cores of a single node, even if they scale successfully to thousands of nodes. For these applications, sharing nodes with other applications can help stress different node resources and thus use them more efficiently. Previous work has shown that the performance impact of node sharing is highly application-dependent, but very little work has studied its effects within batch systems and for complex mixes of parallel applications. Administrators therefore typically fear the complexity of running a batch system that supports node sharing, and also fear that interference between co-allocated jobs leads to worse performance in practice. This paper focuses on sharing nodes by oversubscribing cores through hyper-threading. We introduce new node sharing strategies for batch systems by deriving extensions to the well-known backfill and first-fit algorithms. These strategies have been implemented in the SLURM workload manager, and the evaluation is based on the NERSC Trinity scientific mini-applications. The evaluation of our node sharing strategies shows no overhead from co-allocation, but a 19% increase in computational efficiency and a 25.2% increase in scheduling efficiency compared to standard node allocation.
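To make the core idea concrete, the following is a minimal sketch of a first-fit allocator extended with node sharing via core oversubscription. It is an illustration of the general technique named in the abstract, not the authors' SLURM implementation; all names and the oversubscription factor of 2 (two hardware threads per core under hyper-threading) are assumptions for this example.

```python
# Hypothetical sketch: first-fit scheduling extended with node sharing.
# Each physical core exposes OVERSUBSCRIPTION logical slots, so jobs can
# be co-allocated onto nodes that other jobs already partially occupy.
from dataclasses import dataclass

OVERSUBSCRIPTION = 2  # illustrative: 2 hardware threads per core

@dataclass
class Node:
    cores: int          # physical cores on this node
    allocated: int = 0  # logical slots currently handed out

    def free_slots(self) -> int:
        return self.cores * OVERSUBSCRIPTION - self.allocated

@dataclass
class Job:
    slots_needed: int   # logical cores requested by the job

def first_fit_shared(job: Job, nodes: list[Node]) -> list[Node] | None:
    """Walk the node list in order and place the job on the first nodes
    with enough free logical slots, sharing partially occupied nodes.
    Returns the nodes used, or None if the job must wait in the queue."""
    picked: list[tuple[Node, int]] = []
    remaining = job.slots_needed
    for node in nodes:
        if remaining <= 0:
            break
        take = min(node.free_slots(), remaining)
        if take > 0:
            picked.append((node, take))
            remaining -= take
    if remaining > 0:
        return None  # insufficient capacity even with oversubscription
    for node, take in picked:
        node.allocated += take  # commit the co-allocation
    return [node for node, _ in picked]

# Usage example: a 6-slot job fits across two 2-core nodes (8 logical slots).
if __name__ == "__main__":
    cluster = [Node(cores=2), Node(cores=2)]
    print(first_fit_shared(Job(slots_needed=6), cluster))
```

For reference, stock SLURM already exposes a coarse form of node sharing through the partition-level OverSubscribe option in slurm.conf (e.g., OverSubscribe=FORCE:2 permits two jobs per allocatable resource); the strategies evaluated in the paper go further by extending the backfill and first-fit scheduling logic itself.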