Mehdi Moghaddamfar, Christian Färber, Wolfgang Lehner, Norman May
Proceedings of the 16th International Workshop on Data Management on New Hardware, published 2020-06-14. DOI: 10.1145/3399666.3399897 (https://doi.org/10.1145/3399666.3399897)
Comparative analysis of OpenCL and RTL for sort-merge primitives on FPGA
As a result of recent improvements in FPGA technology, the benefits of FPGAs for highly efficient data processing pipelines are becoming increasingly apparent. However, traditional RTL methods for programming FPGAs require knowledge of digital design and hardware description languages. OpenCL™ provides software developers with a C-based platform for implementing their applications without deep knowledge of digital design. In this paper, we conduct a comparative analysis of OpenCL and RTL-based implementations of a novel heapsort with merging of sorted runs. In particular, we quantitatively compare their performance, FPGA resource utilization, and development effort. Our results show that, while requiring comparable development effort, RTL implementations of critical primitives used in the algorithm achieve 4x better performance while using half as many FPGA resources.