A Task-Pool Parallel I/O Paradigm for an I/O Intensive Application
Jianjiang Li, Lin Yan, Zhe Gao, D. Hei
2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, published 2009-08-18
DOI: 10.1109/ISPA.2009.20 (https://doi.org/10.1109/ISPA.2009.20)
For applications such as 3D seismic migration, improving I/O performance within a cluster computing system is important. Such seismic data processing applications are I/O intensive: a large 3D data volume cannot be held entirely in memory, so the input data files have to be divided into many fine-grained chunks. Intermediate results are written out at various stages during execution, and the final results are written out by the master process. This paper describes a novel method for optimizing the parallel I/O data access strategy and the load balancing of this program model. The optimization, based on an application-defined API, reduces the number of I/O operations and the amount of communication compared with the original model, in which each process/thread reads the whole input data and outputs its own results. It does so by forming groups of threads, each with a "group root" that reads input data (determined by an index retrieved from the master process) and then sends it to its group members. Moreover, the loads are balanced through on-line dynamic scheduling of the access requests for processing the migration data. In actual performance tests, the improvement over the original model is often more than 60%.
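
The group-root scheme described above can be illustrated with a small Python sketch. It is not the paper's implementation or its application-defined API, which the abstract does not show. It assumes an in-memory list stands in for the chunked input file, a shared queue stands in for the master's pool of chunk indices, and threads stand in for the cluster's processes. The names `run_grouped_io`, `group_root`, and `member`, and the placeholder computation, are all hypothetical.

```python
import queue
import threading


def run_grouped_io(chunks, num_groups=2, members_per_group=3):
    """Process every chunk once per group member, reading each chunk only once."""
    task_pool = queue.Queue()  # assumption: stands in for the master's index pool
    for index in range(len(chunks)):
        task_pool.put(index)

    lock = threading.Lock()
    stats = {"reads": 0}
    results = []  # (chunk index, member id, value) tuples

    def member(inbox, member_id):
        # A group member never reads input itself; it waits for its root.
        while True:
            item = inbox.get()
            if item is None:  # sentinel: the pool is drained
                return
            index, data = item
            value = sum(data) * (member_id + 1)  # placeholder computation
            with lock:
                results.append((index, member_id, value))

    def group_root(inboxes):
        while True:
            try:
                index = task_pool.get_nowait()  # dynamic scheduling: next free index
            except queue.Empty:
                break
            data = chunks[index]  # stands in for one read from storage
            with lock:
                stats["reads"] += 1
            for inbox in inboxes:  # forward the chunk to every group member
                inbox.put((index, data))
        for inbox in inboxes:
            inbox.put(None)

    threads = []
    for _ in range(num_groups):
        inboxes = [queue.Queue() for _ in range(members_per_group)]
        threads.append(threading.Thread(target=group_root, args=(inboxes,)))
        for member_id, inbox in enumerate(inboxes):
            threads.append(threading.Thread(target=member, args=(inbox, member_id)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["reads"], sorted(results)


if __name__ == "__main__":
    reads, results = run_grouped_io([[1, 2], [3, 4], [5, 6], [7, 8]])
    print(reads, len(results))
```

In this toy setup, four chunks are read four times in total, once each by whichever root pulls the index. If each of the six members read every chunk itself, as in the original model, there would be 24 reads. Because roots pull indices on demand, a faster group simply takes more chunks, which is one simple way to realize the dynamic load balancing the abstract mentions.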