Abstract: Auto-Tuning of Parallel IO Parameters for HDF5 Applications
Babak Behzad, Joey Huchette, Huong Luu, R. Aydt, Q. Koziol, Prabhat, S. Byna, M. Chaarawi, Yushu Yao
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, p. 1430. DOI: 10.1109/SC.Companion.2012.236
Citations: 7
Abstract
Parallel I/O is an unavoidable part of modern high-performance computing (HPC), but its system-wide dependencies mean that it has eluded optimization across platforms and applications. This can introduce bottlenecks in otherwise computationally efficient code, especially as scientific computing becomes increasingly data-driven. Various studies have shown that dramatic improvements are possible when the parameters are set appropriately. However, because the HPC I/O stack has multiple layers, each with its own optimization parameters, and because a test run takes nontrivial execution time, finding the optimal parameter values is a very complex problem. Additionally, optimal parameter sets do not necessarily translate between use cases, since I/O performance tuning can depend heavily on the individual application, the problem size, and the compute platform being used. Tunable parameters are exposed primarily at three levels of the I/O stack: the system, middleware, and high-level data-organization layers. HPC systems rely on a parallel file system, such as Lustre, to store data intelligently in a parallelized fashion. Middleware communication layers, such as MPI-IO, support this kind of parallel I/O and offer a variety of optimizations, such as collective buffering. Scientists and application developers often use HDF5, a high-level cross-platform I/O library that offers a hierarchical object-database representation of scientific data.
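To make the three tuning levels concrete, the sketch below shows how a parallel HDF5 application can set parameters at each layer from a single code path: Lustre striping and MPI-IO collective-buffering hints through an MPI_Info object, and alignment plus collective transfer mode through HDF5 property lists. This is a minimal illustration, not the paper's method; the specific values (stripe count, buffer sizes, aggregator count) are placeholders rather than tuned results.

```c
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* File-system and middleware layers: Lustre striping and MPI-IO
       collective-buffering hints are passed down via MPI_Info.
       All values below are illustrative placeholders. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "16");        /* Lustre stripe count */
    MPI_Info_set(info, "striping_unit",   "1048576");   /* Lustre stripe size (bytes) */
    MPI_Info_set(info, "cb_nodes",        "8");         /* collective-buffering aggregators */
    MPI_Info_set(info, "cb_buffer_size",  "16777216");  /* collective buffer size (bytes) */

    /* HDF5 layer: the file-access property list selects the MPI-IO driver
       (carrying the hints above) and aligns objects to the stripe size. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
    H5Pset_alignment(fapl, 0, 1048576);

    hid_t file = H5Fcreate("tuned_output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Data-transfer property list: request collective I/O for dataset writes. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    /* ... create datasets and perform writes using dxpl ... */

    H5Pclose(dxpl);
    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```

Because each of these settings lives in a different layer (file system, MPI-IO, HDF5), finding a good combination is exactly the multi-layer search problem the abstract describes: the best values interact with one another and vary with the application, problem size, and platform.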