Zest Checkpoint storage system for large supercomputers

2008 3rd Petascale Data Storage Workshop. Pub Date: 2008-11-01. DOI: 10.1109/PDSW.2008.4811883

P. Nowoczynski, N. Stone, J. Yanovich, J. Sommerfield

Citations: 34

Abstract

The Pittsburgh Supercomputing Center (PSC) has developed a prototype distributed file system infrastructure that vastly accelerates aggregate write bandwidth on large compute platforms. Write bandwidth, more than read bandwidth, is the dominant bottleneck in HPC I/O scenarios because of the need to write checkpoint data, visualization data, and post-processing (multi-stage) data. We have prototyped a scalable solution that will be directly applicable to future petascale compute platforms with on the order of 10^6 cores. Our design emphasizes high-efficiency scalability, low-cost commodity components, lightweight software layers, end-to-end parallelism, client-side caching and software parity, and a unique model of load-balancing outgoing I/O onto high-speed intermediate storage, followed by asynchronous reconstruction to a third-party parallel file system.
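The abstract mentions software parity without describing how Zest computes it. As a rough illustration only, the sketch below shows the generic XOR-parity idea: one parity block is computed over a group of equally sized data blocks, and any single lost block can be rebuilt from the survivors plus the parity. The function names, block contents, and group size are assumptions made for this example and are not taken from the paper.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover a single lost block from the surviving blocks and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical checkpoint fragments (illustrative data, not from the paper).
data = [b"ckpt-000", b"ckpt-001", b"ckpt-002"]
parity = xor_parity(data)

# Pretend the middle block was lost and rebuild it.
recovered = rebuild_missing([data[0], data[2]], parity)
assert recovered == data[1]
```

Computing parity in software on the client side, as the abstract suggests, would avoid dependence on hardware RAID controllers, which is consistent with the stated emphasis on low-cost commodity components.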
