{"title":"Asynchronous progressive irregular prefix operation in HPF2","authors":"Frédéric Brégier, M. Counilh, J. Roman","doi":"10.1109/EMPDP.2000.823421","journal":"Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing","publicationDate":"2000-01-19","url":"https://doi.org/10.1109/EMPDP.2000.823421"}
Asynchronous progressive irregular prefix operation in HPF2
In this paper we study one kind of irregular computation on distributed arrays, the irregular prefix operation, which is currently not well supported by the standard data-parallel language HPF2. We present a parallel implementation that efficiently exploits the independent computations arising in this irregular operation. Our approach relies on a directive that characterizes an irregular prefix operation and on inspector/executor support, implemented in the CoLuMBO library, which optimizes execution by using an asynchronous communication scheme and thereby overlapping communication with computation. We validate our contribution with results obtained on an IBM SP2, both for basic experiments and for a sparse Cholesky factorization algorithm applied to real-size problems.
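The abstract does not define the operation formally, so the following is only a minimal sequential sketch of one possible reading of it, not the paper's directive or the CoLuMBO API. It assumes that each element `j` accumulates the results of an irregular set of earlier elements, given by a hypothetical `deps[j]` list. The inspector groups indices into waves of mutually independent elements, and the executor processes the waves in order. The names `inspect` and `execute` are illustrative, and the asynchronous communication and overlap described above are not modeled here.

```python
from typing import List


def inspect(deps: List[List[int]]) -> List[List[int]]:
    """Inspector: group indices into waves of mutually independent elements.

    deps[j] lists the earlier indices (k < j) whose results feed element j.
    This dependence format is an assumption made for illustration.
    """
    level = [0] * len(deps)
    for j, ds in enumerate(deps):
        for k in ds:
            if not 0 <= k < j:
                raise ValueError(f"element {j} may only depend on earlier indices")
        # An element sits one wave after its deepest predecessor.
        level[j] = 1 + max((level[k] for k in ds), default=-1)

    waves: List[List[int]] = [[] for _ in range(max(level, default=-1) + 1)]
    for j, lvl in enumerate(level):
        waves[lvl].append(j)
    return waves


def execute(values: List[float], deps: List[List[int]],
            waves: List[List[int]]) -> List[float]:
    """Executor: x[j] = values[j] + sum of x[k] over k in deps[j].

    Elements inside one wave do not depend on each other, so a parallel
    runtime could compute them concurrently; here they run sequentially.
    """
    x = list(values)
    for wave in waves:
        for j in wave:
            x[j] = values[j] + sum(x[k] for k in deps[j])
    return x


if __name__ == "__main__":
    deps = [[], [], [0], [0, 1], [2]]
    values = [1, 2, 3, 4, 5]
    waves = inspect(deps)
    print(waves)
    print(execute(values, deps, waves))
```

With `deps[j] = [j - 1]` for every `j > 0`, the sketch degenerates to an ordinary prefix sum in which every wave holds a single element. The irregular case is of interest because sparse dependence patterns, such as those in a sparse Cholesky factorization, leave several elements per wave that can be computed independently.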