Accelerating Conjugate Gradient using OmpSs
Sandra Catalán, X. Martorell, Jesús Labarta, Tetsuzo Usui, Leonel Antonio Toledo Díaz, Pedro Valero-Lara
2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), published 2019-12-01
DOI: https://doi.org/10.1109/PDCAT46702.2019.00033
In this paper, we present the benefits of using the OmpSs concurrent clause when performing reductions, specifically in dot product (DOT) operations. We analyze these benefits by implementing several versions of the Conjugate Gradient (CG) method. We start from a parallel version of the code based on tasks and dependencies; we then introduce the concurrent clause, which allows tasks that have data dependencies among them to overlap their execution. In this way, we aim to show the benefits of the concurrent clause, which might be included in the OpenMP standard, as has previously happened with other OmpSs features. Our tests, performed on a single node of the (Intel-based) MareNostrum 4 supercomputer and a single socket of the (ARM-based) Dibona cluster, show that the concurrent clause may improve performance over the version that uses only tasks and dependencies by around 37% and 23%, respectively.
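To make the idea concrete, the following is a minimal sketch of a blocked dot product in which every block task accumulates into the same scalar under a concurrent clause. It is not the authors' code: the function name `dot_concurrent`, the block size `BLOCK`, and the `oss` pragma sentinel with `[start;len]` array sections are assumptions chosen for illustration, and the exact spelling depends on the OmpSs version and compiler in use. A compiler without OmpSs support ignores the pragmas and runs the loop serially, which still yields the same result.

```c
#include <stddef.h>

/* Hypothetical block size, chosen only for this illustration. */
#define BLOCK 4

/* Blocked dot product expressed as OmpSs-style tasks (a sketch).
 * Without an OmpSs compiler the pragmas are ignored and the code runs serially. */
double dot_concurrent(const double *x, const double *y, size_t n)
{
    double result = 0.0;

    for (size_t start = 0; start < n; start += BLOCK) {
        size_t len = (n - start < BLOCK) ? n - start : BLOCK;

        /* concurrent(result): tasks updating result may overlap with each
         * other, whereas an inout(result) dependency would serialize them. */
        #pragma oss task in(x[start;len], y[start;len]) concurrent(result)
        {
            /* Each task first reduces its own block into a private value. */
            double partial = 0.0;
            for (size_t i = 0; i < len; ++i)
                partial += x[start + i] * y[start + i];

            /* The runtime does not order these updates, so the shared
             * accumulation has to be protected explicitly. */
            #pragma oss atomic
            result += partial;
        }
    }

    /* Wait for all block tasks before reading the reduced value. */
    #pragma oss taskwait
    return result;
}
```

With a plain `inout(result)` dependency, each block task would have to wait for the previous one, so the DOT reduction would become a serial chain inside every CG iteration. The sketch keeps the per-block work private and makes only the final accumulation atomic, which is the usual way to keep contention on the shared scalar low.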