Scalable Systems and Software Architectures for High-Performance Computing on Cloud Platforms
Author: Risshab Srinivas Ramesh
arXiv - CS - Performance, 2024-08-18
DOI: arxiv-2408.10281 (https://doi.org/arxiv-2408.10281)

High-performance computing (HPC) is essential for tackling complex
computational problems across various domains. As the scale and complexity of
HPC applications continue to grow, the need for scalable systems and software
architectures becomes paramount. This paper provides a comprehensive overview
of on-premises HPC architecture, covering both hardware and software aspects,
and details the challenges of building an on-premises HPC cluster. It
explores design principles, challenges, and emerging trends in
building scalable HPC systems and software, addressing issues such as
parallelism, memory hierarchy, communication overhead, and fault tolerance on
various cloud platforms. By synthesizing research findings and technological
advancements, this paper aims to provide insights into scalable solutions for
meeting the evolving demands of HPC applications in the cloud.