H. Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa
{"title":"An OpenCL-Like Offload Programming Framework for SX-Aurora TSUBASA","authors":"H. Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa","doi":"10.1109/PDCAT46702.2019.00059","DOIUrl":null,"url":null,"abstract":"This paper presents an OpenCL-like offload programming framework for NEC SX-Aurora TSUBASA (SXAurora). Unlike traditional vector systems, one node of an SXAurora system consists of a host processor and some vector processors on PCI-Express cards, which are called a vector host and vector engines, respectively. Since the standard OpenCL execution model does not naturally fit in the vector engine, this paper discusses how to adapt the OpenCL specification to SXAurora while considering the trade off between performance and code portability. Performance evaluation results clearly demonstrate that, with a moderate programming effort, the proposed framework can express the collaboration between a vector host and a vector engine so as to make a good use of both of the two different processors. By delegating the right task to the right processor, an OpenCL-like program can fully exploit the performance of SX-Aurora.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDCAT46702.2019.00059","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
This paper presents an OpenCL-like offload programming framework for NEC SX-Aurora TSUBASA (SXAurora). Unlike traditional vector systems, one node of an SXAurora system consists of a host processor and some vector processors on PCI-Express cards, which are called a vector host and vector engines, respectively. Since the standard OpenCL execution model does not naturally fit in the vector engine, this paper discusses how to adapt the OpenCL specification to SXAurora while considering the trade off between performance and code portability. Performance evaluation results clearly demonstrate that, with a moderate programming effort, the proposed framework can express the collaboration between a vector host and a vector engine so as to make a good use of both of the two different processors. By delegating the right task to the right processor, an OpenCL-like program can fully exploit the performance of SX-Aurora.