Operating and Runtime Systems Challenges for HPC Systems

A. Maccabe
DOI: 10.1145/3095770.3095771 (https://doi.org/10.1145/3095770.3095771)
Published in: Proceedings of the 7th International Workshop on Runtime and Operating Systems for Supercomputers ROSS 2017
Publication date: 2017-06-27
Citations: 4

Abstract

Future HPC systems will be characterized by extreme heterogeneity. We will see increasing heterogeneity in virtually every aspect of node architecture from computational engines to memory systems. We will see increasing heterogeneity in applications, including heterogeneity within applications (as previously independent applications are composed to build new applications). We will see increasing heterogeneity in system usage models; in some cases, the HPC system is not the most precious resource being managed. We will also see increasing heterogeneity in the shared services (e.g., storage and visualization systems) that are connected to HPC systems. All of this increasing heterogeneity is certain to create new challenges in the design and implementation of operating and runtime systems. There will be new kinds of resources to manage and many resource management tactics will be invented (and some re-discovered and adapted) to address the new heterogeneity. In essence, we will tacitly agree that the operating and runtime systems need to adapt to enable the inevitable integration of new technologies, applications, usage models, and shared services. While this agreement is critical for our ability to make incremental progress, we, as a community, must step back and ask the relevant question: Does the OS or runtime system bear the brunt of the adaptation, or will we be able to insist on changes in the technologies, applications, and environment? In the past decade, we have seen a similar tradeoff play out between the application teams and the architects of computational engines: how much floating point precision is required and how is this precision implemented? How can we define similar tradeoffs that are important in the design and implementation of operating and runtime systems?