Zhida An, Ding Li, Yao Guo, Guijin Gao, Yuxin Ren, Ning Jia, Xinwei Hu
{"title":"面向操作系统异构的高性能计算集群管理","authors":"Zhida An, Ding Li, Yao Guo, Guijin Gao, Yuxin Ren, Ning Jia, Xinwei Hu","doi":"10.1145/3609510.3609819","DOIUrl":null,"url":null,"abstract":"To achieve extremely high performance in HPC, many researchers have proposed customized operating systems that are tailored to HPC workload characteristics and emerging hardware. Hence, we argue that the HPC cluster will move away from the single OS environment to a cluster with numerous heterogeneous OSes. However, existing HPC cluster management still assumes that all nodes are equipped with the same OS and fails to consider OS heterogeneity during job scheduling. As a result, such unawareness loses most performance benefits provided by specialized OSes. This paper quantitatively investigates the problem of ignoring OS heterogeneity in the current HPC cluster management and analyzes performance trade-offs inside heterogeneous OSes. Preliminary results on a variety of HPC OSes and applications confirm the performance penalty of the existing cluster scheduler. We then propose a cluster scheduler prototype that incorporates OS heterogeneity into cluster configuration, resource monitoring, and job placement. We also present open challenges for future research on OS heterogeneity aware HPC clusters.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Towards OS Heterogeneity Aware Cluster Management for HPC\",\"authors\":\"Zhida An, Ding Li, Yao Guo, Guijin Gao, Yuxin Ren, Ning Jia, Xinwei Hu\",\"doi\":\"10.1145/3609510.3609819\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To achieve extremely high performance in HPC, many researchers have proposed customized operating systems that are tailored to HPC workload characteristics and emerging hardware. Hence, we argue that the HPC cluster will move away from the single OS environment to a cluster with numerous heterogeneous OSes. However, existing HPC cluster management still assumes that all nodes are equipped with the same OS and fails to consider OS heterogeneity during job scheduling. As a result, such unawareness loses most performance benefits provided by specialized OSes. This paper quantitatively investigates the problem of ignoring OS heterogeneity in the current HPC cluster management and analyzes performance trade-offs inside heterogeneous OSes. Preliminary results on a variety of HPC OSes and applications confirm the performance penalty of the existing cluster scheduler. We then propose a cluster scheduler prototype that incorporates OS heterogeneity into cluster configuration, resource monitoring, and job placement. We also present open challenges for future research on OS heterogeneity aware HPC clusters.\",\"PeriodicalId\":149629,\"journal\":{\"name\":\"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3609510.3609819\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3609510.3609819","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Towards OS Heterogeneity Aware Cluster Management for HPC
To achieve extremely high performance in HPC, many researchers have proposed customized operating systems that are tailored to HPC workload characteristics and emerging hardware. Hence, we argue that the HPC cluster will move away from the single OS environment to a cluster with numerous heterogeneous OSes. However, existing HPC cluster management still assumes that all nodes are equipped with the same OS and fails to consider OS heterogeneity during job scheduling. As a result, such unawareness loses most performance benefits provided by specialized OSes. This paper quantitatively investigates the problem of ignoring OS heterogeneity in the current HPC cluster management and analyzes performance trade-offs inside heterogeneous OSes. Preliminary results on a variety of HPC OSes and applications confirm the performance penalty of the existing cluster scheduler. We then propose a cluster scheduler prototype that incorporates OS heterogeneity into cluster configuration, resource monitoring, and job placement. We also present open challenges for future research on OS heterogeneity aware HPC clusters.