Using On-Demand File Systems in HPC Environments
Mehmet Soysal, M. Berghoff, T. Zirwes, Marc-André Vef, Sebastian Oeste, A. Brinkmann, W. Nagel, A. Streit
2019 International Conference on High Performance Computing & Simulation (HPCS), published 2019-07-01
DOI: 10.1109/HPCS48598.2019.9188216 (https://doi.org/10.1109/HPCS48598.2019.9188216)
In modern HPC systems, parallel (distributed) file systems provide fast access to and from the storage infrastructure. However, I/O performance in large-scale HPC systems has failed to keep up with the increase in computational power. As a result, the I/O subsystem, which also has to cope with a large number of demanding metadata operations, is often the bottleneck of the entire HPC system. In some cases, even a single badly behaving application can slow down the entire HPC system, disrupting other applications that use the same I/O subsystem. Such situations are likely to become more frequent as HPC systems grow larger and more powerful. In this work, we present a simple solution for applications with very high I/O demands: creating a private, on-demand parallel file system for an HPC job on the node-local storage devices, e.g., solid-state drives (SSDs). We show that this feature is easy to add to an existing HPC environment and requires only minimal configuration changes to the system. We conclude that the impact on running applications is manageable and that, for applications generating a high load, the advantages outweigh the disadvantages. We show that in some cases applications may run slower, but in these cases the reduced load on the global file system outweighs the slowdown.