Rethinking High Performance Computing Platforms: Challenges, Opportunities and Recommendations

Ole Weidner, M. Atkinson, A. Barker, Rosa Filgueira Vicente
Published in: Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing, 2016-06-01. DOI: 10.1145/2912152.2912155 (https://doi.org/10.1145/2912152.2912155)
Citations: 19

Abstract

A growing number of "second generation" high-performance computing applications with heterogeneous, dynamic and data-intensive properties have an extended set of requirements, which cover application deployment, resource allocation and control, and I/O scheduling. These requirements are not met by the current production HPC platform models and policies. This results in a loss of opportunity, productivity and innovation for new computational methods and tools. It also decreases effective system utilization for platform providers due to unsupervised workarounds and "rogue" resource management strategies implemented in application space. In this paper we critically discuss the dominant HPC platform model and describe the challenges it creates for second-generation applications because of its asymmetric resource view, interfaces and software deployment policies. We present an extended, more symmetric and application-centric platform model that adds decentralized deployment, introspection, bidirectional control and information flow, and more comprehensive resource scheduling. We describe cHPC, an early prototype of a non-disruptive implementation based on Linux Containers (LXC). It can operate alongside existing batch queuing systems and exposes a symmetric platform API without interfering with existing applications and usage modes. We see our approach as a viable, incremental next step in HPC platform evolution that benefits applications and platform providers alike. To demonstrate this further, we lay out a roadmap for future research and experimental evaluation.