The impact of spatial layout of jobs on parallel I/O performance
Authors: Jens Mache, V. Lo, M. Livingston, Sharad Garg
Venue: Workshop on I/O in Parallel and Distributed Systems
Published: 1999-05-01
DOI: 10.1145/301816.301830 (https://doi.org/10.1145/301816.301830)
Citations: 8
Abstract
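Input/output is a major obstacle to the effective use of teraflops-scale computing systems. Motivated by earlier parallel I/O measurements on an Intel TFLOPS machine, we conduct studies to determine the sensitivity of parallel I/O performance on multiprogrammed mesh-connected machines with respect to the number of I/O nodes, the number of compute nodes, network link bandwidth, I/O node bandwidth, the spatial layout of jobs, and the read or write demands of applications. Our extensive simulations and analytical modeling yield important insights into the limitations on parallel I/O performance due to network contention, and into the possible gains in parallel I/O performance that can be achieved by tuning the spatial layout of jobs. Applying these results, we devise a new processor allocation strategy that is sensitive to parallel I/O traffic and the resulting network contention. In performance evaluations driven by synthetic workloads and by a real workload trace captured at the San Diego Supercomputing Center, the new strategy improves the average response time of parallel I/O intensive jobs by up to a factor of 4.5.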
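To make the link between job layout and network contention concrete, the following is a minimal sketch, not the authors' simulator or allocation strategy. It assumes a 2D mesh with dimension-ordered (X-then-Y) routing, one directed channel per link direction, I/O nodes placed in column 0, and a striped write in which every compute node sends one unit of traffic to every I/O node. The function names `xy_route` and `max_link_load` and both example layouts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def xy_route(src, dst):
    """Return the directed links on an X-then-Y path from src to dst."""
    x, y = src
    dst_x, dst_y = dst
    links = []
    while x != dst_x:  # travel along the X dimension first
        next_x = x + (1 if dst_x > x else -1)
        links.append(((x, y), (next_x, y)))
        x = next_x
    while y != dst_y:  # then along the Y dimension
        next_y = y + (1 if dst_y > y else -1)
        links.append(((x, y), (x, next_y)))
        y = next_y
    return links


def max_link_load(compute_nodes, io_nodes):
    """Load on the busiest link when each compute node writes one unit to each I/O node."""
    load = Counter()
    for c in compute_nodes:
        for io in io_nodes:
            for link in xy_route(c, io):
                load[link] += 1
    return max(load.values())


# Assumed placement: four I/O nodes in column 0, rows 0..3.
io_nodes = [(0, y) for y in range(4)]

# Two layouts for the same 8-node job.
row_layout = [(x, 0) for x in range(1, 9)]                    # 8 x 1 strip
block_layout = [(x, y) for x in (1, 2) for y in range(4)]     # 2 x 4 block

print("row layout, busiest link:", max_link_load(row_layout, io_nodes))      # 32
print("block layout, busiest link:", max_link_load(block_layout, io_nodes))  # 8
```

Under these assumptions the strip layout funnels all traffic through the single link entering node (0, 0), while the block layout spreads it over four rows, so the busiest link carries a quarter of the load. The numbers are specific to this toy setup and are not results from the paper.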