Work-in-Progress: Scheduler for Collaborated FPGA-GPU-CPU Based on Intermediate Language
Na Hu, Chao Wang, Xuehai Zhou, Xi Li
2022 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), published 2022-10-01
DOI: 10.1109/CODES-ISSS55005.2022.00008 (https://doi.org/10.1109/CODES-ISSS55005.2022.00008)
FPGA-GPU-CPU collaboration offers a compromise between high performance and low cost in modern computing systems. However, the large space of possible mappings between modules and heterogeneous processors complicates the scheduling algorithm. This paper proposes a uniform-pipeline-based, real-time-oriented scheduling algorithm and a servant execution-flow model (SEFM) optimized for this scheduler. At runtime, SEFM generates the target code from the intermediate language (IL) and scheduler-controlled parameters. Algorithms such as contrast stretching are accelerated by 1.4-2.7×, 1.9-3.8×, and 2.7-10.5× on CPU, GPU, and FPGA, respectively, over an OpenCV baseline. A case study of a 3D waveform oscilloscope that applies the scheduling solution to the collaborating processors achieves 1.5× the resource utilization of a pure-FPGA implementation.
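
The abstract names contrast stretching as one of the benchmarked algorithms. As a point of reference, here is a minimal sketch of the standard min-max form of that operation. This record gives neither the paper's IL representation nor the code SEFM generates, so the function name, its parameters, and the pure-Python form are assumptions made purely for illustration.

```python
def contrast_stretch(pixels, out_min=0, out_max=255):
    """Min-max contrast stretching on a flat list of intensities.

    Hypothetical helper for illustration; not the paper's implementation.
    The darkest input maps to out_min and the brightest to out_max.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch; return it unchanged.
        return list(pixels)
    scale = (out_max - out_min) / (hi - lo)
    return [round((p - lo) * scale) + out_min for p in pixels]


row = [50, 100, 150, 200]
print(contrast_stretch(row))  # [0, 85, 170, 255]
```

Each output pixel depends only on its own input value and two global statistics (the minimum and maximum), which is the kind of data-parallel structure that can map onto a CPU, GPU, or FPGA pipeline stage.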