分布式内存机上编译核外数据并行程序中的数据访问重组

Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI:10.1109/IPPS.1997.580956

M. Kandemir, R. Bordawekar, A. Choudhary

{"title":"分布式内存机上编译核外数据并行程序中的数据访问重组","authors":"M. Kandemir, R. Bordawekar, A. Choudhary","doi":"10.1109/IPPS.1997.580956","DOIUrl":null,"url":null,"abstract":"This paper describes optimization techniques for translating out-of-core programs written in a data parallel language to message passing node programs with explicit parallel I/O. We demonstrate that straightforward extension of in-core compilation techniques does not work well for out-of-core programs. We then describe how the compiler can optimize the code by (1) determining appropriate file layouts for out-of-core arrays, (2) permuting the loops in the nest(s) to allow efficient file access, and (3) partitioning the available node memory among references based on I/O cost estimation. Our experimental results indicate that these optimizations can reduce the amount of time spent in I/O by as much as an order of magnitude.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Data access reorganizations in compiling out-of-core data parallel programs on distributed memory machines\",\"authors\":\"M. Kandemir, R. Bordawekar, A. Choudhary\",\"doi\":\"10.1109/IPPS.1997.580956\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes optimization techniques for translating out-of-core programs written in a data parallel language to message passing node programs with explicit parallel I/O. We demonstrate that straightforward extension of in-core compilation techniques does not work well for out-of-core programs. We then describe how the compiler can optimize the code by (1) determining appropriate file layouts for out-of-core arrays, (2) permuting the loops in the nest(s) to allow efficient file access, and (3) partitioning the available node memory among references based on I/O cost estimation. Our experimental results indicate that these optimizations can reduce the amount of time spent in I/O by as much as an order of magnitude.\",\"PeriodicalId\":145892,\"journal\":{\"name\":\"Proceedings 11th International Parallel Processing Symposium\",\"volume\":\"11 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1997-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 11th International Parallel Processing Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPPS.1997.580956\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 11th International Parallel Processing Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPPS.1997.580956","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 25

摘要

本文描述了将用数据并行语言编写的核外程序转换为具有显式并行I/O的消息传递节点程序的优化技术。我们证明了内核内编译技术的直接扩展并不适用于内核外的程序。然后，我们描述了编译器如何通过以下方式优化代码:(1)为核外阵列确定适当的文件布局，(2)在嵌套中排列循环以允许有效的文件访问，以及(3)基于I/O成本估算在引用之间划分可用节点内存。我们的实验结果表明，这些优化可以将花在I/O上的时间减少一个数量级。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Data access reorganizations in compiling out-of-core data parallel programs on distributed memory machines

This paper describes optimization techniques for translating out-of-core programs written in a data parallel language to message passing node programs with explicit parallel I/O. We demonstrate that straightforward extension of in-core compilation techniques does not work well for out-of-core programs. We then describe how the compiler can optimize the code by (1) determining appropriate file layouts for out-of-core arrays, (2) permuting the loops in the nest(s) to allow efficient file access, and (3) partitioning the available node memory among references based on I/O cost estimation. Our experimental results indicate that these optimizations can reduce the amount of time spent in I/O by as much as an order of magnitude.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings 11th International Parallel Processing Symposium

自引率

0.00%

发文量