I. Aliyev, J. Mack, Nirmal Kumbhare, A. Akoglu, H. F. Ugurdag
2021 6th International Conference on Computer Science and Engineering (UBMK), published 2021-09-15. DOI: 10.1109/UBMK52708.2021.9558926 (https://doi.org/10.1109/UBMK52708.2021.9558926)
FPGA-based Minimal Latency HEFT Scheduler for Heterogeneous Computing
This paper proposes a new hardware scheduler. As heterogeneous computing becomes prevalent, mapping applications onto multiple processing elements (PEs) proves to be non-trivial. The Heterogeneous Earliest Finish Time (HEFT) algorithm is an existing scheduler that aims to minimize the total execution time of an application. HEFT accepts an acyclic task graph as input at run-time and assigns and schedules the precompiled atomic tasks to PEs. It stands out among such schedulers both for producing shorter schedules and for its own short execution time. However, real-time applications benefit from the lowest scheduling latency achievable. To the best of our knowledge, this is the only work that implements HEFT in hardware (on an FPGA), lowering its latency from milliseconds to less than a microsecond in the best case. Porting HEFT to hardware was challenging because data dependencies limit the amount of parallelism. The design of an efficient memory access pattern and of an “incremental sorter” were key enablers in reducing the latency of the hardware implementation. We also integrated our FPGA-HEFT into an ARM-based SoC and validated its functionality using a realistic workload.
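For readers unfamiliar with HEFT, the following is a minimal software sketch of the classical algorithm that the abstract refers to: tasks are prioritized by upward rank, then each task is placed on the PE that gives it the earliest finish time. It is not the FPGA design described in the paper. The function name `heft`, the input dictionaries `succ`, `cost`, and `comm`, and the example graph are all hypothetical, chosen only for illustration. The sketch also makes two simplifying assumptions: every task has a positive execution time, and a task is always appended after the last task on a PE (the insertion-based slot search of the original algorithm is omitted).

```python
from functools import lru_cache


def heft(succ, cost, comm):
    """Simplified HEFT. Returns {task: (pe, start, finish)}.

    succ: {task: [successor tasks]}, an acyclic task graph
    cost: {task: [execution time on PE 0, PE 1, ...]}, all times > 0
    comm: {(u, v): transfer time when u and v run on different PEs}
    """
    num_pes = len(next(iter(cost.values())))

    # Build the predecessor lists from the successor lists.
    pred = {t: [] for t in cost}
    for u, vs in succ.items():
        for v in vs:
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def rank_u(t):
        # Upward rank: average cost of t plus the longest path to an exit task.
        avg = sum(cost[t]) / num_pes
        tail = max((comm.get((t, s), 0) + rank_u(s) for s in succ.get(t, [])),
                   default=0)
        return avg + tail

    # Decreasing upward rank is a valid topological order when costs are > 0.
    order = sorted(cost, key=rank_u, reverse=True)

    pe_free = [0.0] * num_pes  # time at which each PE becomes idle
    sched = {}
    for t in order:
        best = None
        for p in range(num_pes):
            # Inputs are ready once every predecessor has finished and,
            # if it ran on another PE, its data has been transferred.
            ready = max((sched[u][2] + (0 if sched[u][0] == p
                                        else comm.get((u, t), 0))
                         for u in pred[t]), default=0.0)
            start = max(ready, pe_free[p])
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        sched[t] = best
        pe_free[best[0]] = best[2]
    return sched


# Hypothetical four-task diamond graph on two PEs.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {"A": [2, 3], "B": [3, 1], "C": [4, 2], "D": [1, 2]}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}

sched = heft(succ, cost, comm)
print(max(finish for _, _, finish in sched.values()))  # 7.0
```

The sort by upward rank and the dependence of each placement on all earlier placements are the kind of sequential steps that the abstract identifies as limiting parallelism in hardware.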