Authors: T. Islam, Chase Phelps
Venue: 2021 IEEE 28th International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW)
Published: 2021-12-01
DOI: https://doi.org/10.1109/HiPCW54834.2021.00011
HPC@SCALE: A Hands-on Approach for Training Next-Gen HPC Software Architects
High Performance Computing (HPC) systems enable multi-scale simulations that yield meaningful insights into otherwise experimentally intractable phenomena, such as climate change and destabilizing drug-protein interactions to cure cancer. The high levels of parallelism and heterogeneous architectures in these systems offer unprecedented computational capability at the cost of complexity and dynamism. Recent efforts in workforce development have focused on giving students the background to write programs for these complex heterogeneous platforms successfully [1]. However, computing is only one of the three tasks an HPC application performs; the other two are communication and I/O. Since network bandwidth is not scaling proportionately with computational capabilities, moving the large volume of data generated by these applications through the network slows down scientific progress. A high-level data-driven analysis shows that most existing curricula do not prepare students to consider the design choices needed to scale parallel I/O, which is a crucial building block of an end-to-end system. At its core, the problem of scaling data-intensive applications is common to high-performance, high-throughput, and cloud computing environments, so training in this area will have a broad impact. To fill this gap, we have designed a new course called HPC@SCALE to train students at Texas State University in building scalable end-to-end system software, with a focus on minimizing parallel I/O.