Yong Cho, M. Winslett, M. Subramaniam, Ying Chen, S. Kuo, K. Seamons
Workshop on I/O in Parallel and Distributed Systems, 1997-11-17. DOI: 10.1145/266220.266221 (https://doi.org/10.1145/266220.266221)
Exploiting local data in parallel array I/O on a practical network of workstations
A cost-effective way to run a parallel application is to use existing workstations connected by a local area network such as Ethernet or FDDI. In this paper, we present an approach to parallel I/O of multidimensional arrays on small networks of workstations with a shared-media interconnect, using the Panda I/O library. In such an environment, the message-passing throughput per node is lower than the throughput obtainable from a fast disk, and it is not easy for users to determine the configuration that will yield the best I/O performance. We introduce an I/O strategy that exploits local data to reduce the amount of data that must be shipped across the network, present experimental results, and use an analytical performance model both to analyze those results and to predict the best choice of I/O parameters. Our experiments show that the new strategy improves response time by a factor of 1.2 to 2.1 over the Panda version originally developed for the IBM SP2, depending on the array sizes, distributions, and compute and I/O node meshes. Further, the performance model predicts the results within a 13% margin of error.
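To make the idea of exploiting local data concrete, the following is a minimal toy sketch, not Panda's actual algorithm or the paper's performance model. It assumes (hypothetically) that some compute nodes also act as I/O nodes, that transfers on the shared medium serialize, and that disks write in parallel. The function names, the node-to-I/O-node assignment, and the bandwidth figures are all illustrative assumptions.

```python
def bytes_over_network(chunk_mb, assignment):
    """Total MB that must cross the network.

    chunk_mb:   {compute_node: MB of array data held by that node}
    assignment: {compute_node: I/O node that will write that data}
    Data whose I/O node is the node already holding it stays local.
    """
    return sum(size for node, size in chunk_mb.items()
               if assignment[node] != node)


def estimate_write_time(chunk_mb, assignment, net_mb_per_s, disk_mb_per_s):
    """Toy response-time estimate in seconds (assumed model, not the paper's)."""
    # Assumption: a shared-media network carries one transfer at a time.
    network_time = bytes_over_network(chunk_mb, assignment) / net_mb_per_s

    # Assumption: I/O nodes write concurrently, so the busiest disk dominates.
    per_io_node = {}
    for node, size in chunk_mb.items():
        io_node = assignment[node]
        per_io_node[io_node] = per_io_node.get(io_node, 0) + size
    disk_time = max(per_io_node.values()) / disk_mb_per_s

    return network_time + disk_time


# Four compute nodes hold 8 MB each; nodes 0 and 1 double as I/O nodes.
chunks = {0: 8, 1: 8, 2: 8, 3: 8}
ignores_locality = {0: 1, 1: 0, 2: 0, 3: 1}  # every chunk is shipped
exploits_locality = {0: 0, 1: 1, 2: 0, 3: 1}  # nodes 0 and 1 keep their data

baseline = estimate_write_time(chunks, ignores_locality, 1.0, 4.0)
local = estimate_write_time(chunks, exploits_locality, 1.0, 4.0)
print(baseline, local)
```

In this toy setting, halving the shipped data shortens the estimated response time because the network, not the disk, is the bottleneck, which is the regime the abstract describes. The numbers are made up for illustration and are not taken from the paper's experiments.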