{"title":"动态安排的高级综合","authors":"Lana Josipović, Radhika Ghosal, P. Ienne","doi":"10.1145/3174243.3174264","DOIUrl":null,"url":null,"abstract":"High-level synthesis (HLS) tools almost universally generate statically scheduled datapaths. Static scheduling implies that circuits out of HLS tools have a hard time exploiting parallelism in code with potential memory dependencies, with control-dependent dependencies in inner loops, or where performance is limited by long latency control decisions. The situation is essentially the same as in computer architecture between Very-Long Instruction Word (VLIW) processors and dynamically scheduled superscalar processors; the former display the best performance per cost in highly regular embedded applications, but general purpose, irregular, and control-dominated computing tasks require the runtime flexibility of dynamic scheduling. In this work, we show that high-level synthesis of dynamically scheduled circuits is perfectly feasible by describing the implementation of a prototype synthesizer which generates a particular form of latency-insensitive synchronous circuits. Compared to a commercial HLS tool, the result is a different trade-off between performance and circuit complexity, much as superscalar processors represent a different trade-off compared to VLIW processors: in demanding applications, the performance is very significantly improved at an affordable cost. We here demonstrate only the first steps towards more performant high-level synthesis tools adapted to emerging FPGA applications and the demands of computing in broader application domains.","PeriodicalId":164936,"journal":{"name":"Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"72","resultStr":"{\"title\":\"Dynamically Scheduled High-level Synthesis\",\"authors\":\"Lana Josipović, Radhika Ghosal, P. Ienne\",\"doi\":\"10.1145/3174243.3174264\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-level synthesis (HLS) tools almost universally generate statically scheduled datapaths. Static scheduling implies that circuits out of HLS tools have a hard time exploiting parallelism in code with potential memory dependencies, with control-dependent dependencies in inner loops, or where performance is limited by long latency control decisions. The situation is essentially the same as in computer architecture between Very-Long Instruction Word (VLIW) processors and dynamically scheduled superscalar processors; the former display the best performance per cost in highly regular embedded applications, but general purpose, irregular, and control-dominated computing tasks require the runtime flexibility of dynamic scheduling. In this work, we show that high-level synthesis of dynamically scheduled circuits is perfectly feasible by describing the implementation of a prototype synthesizer which generates a particular form of latency-insensitive synchronous circuits. Compared to a commercial HLS tool, the result is a different trade-off between performance and circuit complexity, much as superscalar processors represent a different trade-off compared to VLIW processors: in demanding applications, the performance is very significantly improved at an affordable cost. We here demonstrate only the first steps towards more performant high-level synthesis tools adapted to emerging FPGA applications and the demands of computing in broader application domains.\",\"PeriodicalId\":164936,\"journal\":{\"name\":\"Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays\",\"volume\":\"66 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-02-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"72\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3174243.3174264\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3174243.3174264","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
High-level synthesis (HLS) tools almost universally generate statically scheduled datapaths. Static scheduling implies that circuits out of HLS tools have a hard time exploiting parallelism in code with potential memory dependencies, with control-dependent dependencies in inner loops, or where performance is limited by long latency control decisions. The situation is essentially the same as in computer architecture between Very-Long Instruction Word (VLIW) processors and dynamically scheduled superscalar processors; the former display the best performance per cost in highly regular embedded applications, but general purpose, irregular, and control-dominated computing tasks require the runtime flexibility of dynamic scheduling. In this work, we show that high-level synthesis of dynamically scheduled circuits is perfectly feasible by describing the implementation of a prototype synthesizer which generates a particular form of latency-insensitive synchronous circuits. Compared to a commercial HLS tool, the result is a different trade-off between performance and circuit complexity, much as superscalar processors represent a different trade-off compared to VLIW processors: in demanding applications, the performance is very significantly improved at an affordable cost. We here demonstrate only the first steps towards more performant high-level synthesis tools adapted to emerging FPGA applications and the demands of computing in broader application domains.