Advanced resource management: A hands-on master course in HPC and cloud computing
Authors: Lucia Pons, Salvador Petit, Julio Sahuquillo
Journal: Journal of Parallel and Distributed Computing, vol. 202, Article 105091 (published 2025-04-23)
DOI: 10.1016/j.jpdc.2025.105091
URL: https://www.sciencedirect.com/science/article/pii/S0743731525000589
Impact factor: 3.4 · JCR: Q1 (Computer Science, Theory & Methods)
Resource management has become a major concern for performance and fairness in recent computing servers, which include a wide variety of shared resources. To achieve high-performing and efficient systems, both hardware and software engineers must be thoroughly trained in effective resource management techniques. This paper introduces the GRE master course (Spanish acronym for Resource Management and Performance Evaluation in Cloud and High-Performance Workloads), which has been offered since Fall 2023. The course is taught by instructors with broad research expertise in resource management and performance evaluation. Subjects covered in this course include workload characterization, state-of-the-art resource management approaches, and performance evaluation tools and methodologies used in production systems. Management techniques are studied in the context of both HPC and cloud computing, where resource efficiency is becoming a primary concern. To enhance the learning experience, the course integrates theoretical concepts with a wide set of hands-on tasks carried out on recent real platforms. A real virtualized cloud environment is mimicked using software typically deployed in production systems, such as Proxmox Virtual Environment. Students learn to use tools such as Linux perf and Intel VTune Profiler, which researchers and practitioners commonly employ for tasks like performance bottleneck analysis from a microarchitectural perspective. Overall, the GRE course provides students with a solid foundation and skills in resource management by addressing current hot topics in both industry and academia. Student satisfaction and learning outcomes indicate the success of the GRE course and encourage us to continue in this direction.
About the journal:
This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing.
The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics, again covering the full range from the design to the use of the targeted systems.