Siddhisanket Raskar, Jose M Monsalve Diaz, T. Applencourt, Kalyan Kumaran, Guangrong Gao
Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering. Published 2023-04-15.
DOI: 10.1145/3578244.3583734 (https://doi.org/10.1145/3578244.3583734)
Implementation of Dataflow Software Pipelining for Codelet Model
Computer architectures have evolved from single-core designs to chips with thousands of cores. Loop- and instruction-level parallelism techniques such as software pipelining, which are successful on single cores, have limitations in the multi-core era. We extend software pipelining beyond the limits of fine-grained, instruction-level parallelism. We accomplish this through dataflow software pipelining and its extension. Specifically, we present extensions to the dataflow-based codelet model and its abstract machine to exploit pipelined parallelism across loops. We extend the runtime implementation of the codelet model with our proposed extensions to take advantage of dataflow software pipelining principles, using efficient single-owner FIFO buffers across codelet dependencies. We show promising improvements from dataflow software pipelining techniques through an in-depth case study of Cannon's algorithm for matrix multiplication.
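The idea of pipelining across loops through FIFO buffers on dependency edges can be illustrated with a minimal sketch. This is not the authors' runtime or the codelet model API: the function names (`producer_loop`, `stage_loop`, `consumer_loop`, `run_pipeline`), the sentinel-based termination, and the use of Python threads are all assumptions made for illustration. Each loop stands in for a codelet that fires once per incoming token, and each bounded `queue.Queue` stands in for a FIFO buffer with exactly one writer and one reader, which is one possible reading of "single-owner".

```python
import queue
import threading

# Hypothetical end-of-stream marker; not part of the codelet model itself.
SENTINEL = object()


def producer_loop(n, out_fifo):
    """First loop: iteration i produces one token and pushes it downstream."""
    for i in range(n):
        out_fifo.put(i * i)
    out_fifo.put(SENTINEL)


def stage_loop(in_fifo, out_fifo):
    """Second loop: fires as soon as a token is available, without waiting
    for the first loop to finish all of its iterations."""
    while True:
        token = in_fifo.get()
        if token is SENTINEL:
            out_fifo.put(SENTINEL)
            return
        out_fifo.put(token + 1)


def consumer_loop(in_fifo, results):
    """Third loop: collects tokens in arrival order."""
    while True:
        token = in_fifo.get()
        if token is SENTINEL:
            return
        results.append(token)


def run_pipeline(n, depth=4):
    """Connect three loops with two bounded FIFOs and run them concurrently.

    `depth` bounds how far one loop may run ahead of the next one.
    """
    a_to_b = queue.Queue(maxsize=depth)  # one writer, one reader
    b_to_c = queue.Queue(maxsize=depth)  # one writer, one reader
    results = []
    threads = [
        threading.Thread(target=producer_loop, args=(n, a_to_b)),
        threading.Thread(target=stage_loop, args=(a_to_b, b_to_c)),
        threading.Thread(target=consumer_loop, args=(b_to_c, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))  # [1, 2, 5, 10, 17]
```

Because every FIFO has a single producer and a single consumer, token order is preserved along each edge, so the pipelined result matches what the three loops would produce if run one after another. The bounded depth applies back-pressure: a fast upstream loop blocks once the buffer is full instead of running arbitrarily far ahead.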