Understanding Parallel I/O Performance and Tuning
S. Byna
Fifth International Workshop on Systems and Network Telemetry and Analytics, 2022-06-27
DOI: 10.1145/3526064.3534114
Abstract
Performance of parallel I/O is critical for large-scale scientific applications that store and access data on parallel file systems of high-performance computing (HPC) systems. These applications often use HPC systems to generate and analyze large amounts of data, and they rely on the parallel I/O software stack to store and retrieve it. This stack comprises several layers of software libraries: high-level I/O libraries such as HDF5, middleware (MPI-IO), and low-level I/O interfaces (POSIX, STD-IO). Each of these layers has complex inter-dependencies with the others that significantly impact I/O performance. As a result, scientific applications frequently spend a large fraction of their execution time reading and writing data on parallel file systems. These inter-dependencies also complicate tuning parallel I/O performance. A typical parallel I/O performance tuning approach includes collecting performance logs or traces, identifying performance bottlenecks, attributing root causes, and devising optimization strategies. Toward this systematic process, we have done research on collecting Darshan traces for I/O, studying logs from production supercomputing systems, performing root-cause analysis by zooming into application I/O performance, visualizing parallel I/O performance, and applying performance tuning. We will introduce parallel I/O basics, I/O monitoring using various profiling tools, analysis of logs collected on production-class supercomputers to identify performance bottlenecks, and application of performance tuning options. We will also describe numerous application use cases and the resulting performance improvements.
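The first two steps of the tuning workflow described above, collecting traces and identifying which layer of the stack dominates I/O time, can be sketched with a small script. The sample log text, its column layout, and the `io_time_by_module` helper below are illustrative assumptions modeled loosely on the tab-separated text that `darshan-parser` emits for POSIX and MPI-IO counters; they are a sketch of the analysis idea, not the actual tool's format.

```python
from collections import defaultdict

# Hypothetical, simplified sample of parsed Darshan counters:
# module, rank, counter name, value (tab-separated). Real darshan-parser
# output has more columns; this sketch keeps only the fields it needs.
SAMPLE_LOG = """\
POSIX\t0\tPOSIX_F_READ_TIME\t1.50
POSIX\t0\tPOSIX_F_WRITE_TIME\t4.25
MPI-IO\t0\tMPIIO_F_READ_TIME\t0.40
MPI-IO\t0\tMPIIO_F_WRITE_TIME\t2.10
"""

def io_time_by_module(log_text):
    """Sum the *_F_READ_TIME / *_F_WRITE_TIME counters per stack layer."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        module, _rank, counter, value = line.split("\t")
        if counter.endswith(("_F_READ_TIME", "_F_WRITE_TIME")):
            totals[module] += float(value)
    return dict(totals)

totals = io_time_by_module(SAMPLE_LOG)
print(totals)
# The layer with the largest accumulated I/O time is a first candidate
# for closer root-cause analysis and tuning.
print("Most I/O time spent in:", max(totals, key=totals.get))
```

Aggregating timing counters per layer like this is only a starting point; in practice one would also compare I/O time against total runtime and drill into per-rank and per-file records to attribute root causes.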