Authors: Zheng Du, Jing Zhang, Shihao Sha, Qiuming Luo
Venue: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Published: 2019-12-01
DOI: 10.1109/PDCAT46702.2019.00032 (https://doi.org/10.1109/PDCAT46702.2019.00032)
Implementing the Matrix Multiplication with DFC on Kunlun Small Scale Computer
In this paper, we demonstrate DFC, a new dataflow platform that can handle successive dataflow computing passes by means of tagged data. By implementing matrix multiplication in DFC, we show that DFC can exploit parallelism automatically with a much simpler dataflow graph, constructed from the DF functions of DFC. Unlike other dataflow execution platforms, DFC supports multiple worker threads for a single dataflow node of DF functions. Running the DFC matrix multiplication program on the Kunlun system verified that DFC achieves a reasonable speedup for large-scale computing with thread counts up to 512.
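
The idea of one dataflow node served by several worker threads, with tags keeping the data of successive passes apart, can be illustrated with a minimal sketch. This is not DFC code and does not use the DFC API, which the abstract does not show: the names `run_node`, `dataflow_matmul`, `pass_id`, and `block` are hypothetical, and Python's standard `threading` and `queue` modules stand in for the DFC runtime. Because of Python's global interpreter lock the sketch shows the structure only, not the speedup reported on Kunlun.

```python
import queue
import threading


def matmul_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B (lists of lists)."""
    cols = list(zip(*b))  # columns of B
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def run_node(tokens, num_workers):
    """Hypothetical dataflow node served by several worker threads.

    Each token is (tag, a_rows, b). The tag travels with the result, so
    outputs stay identifiable even when workers finish out of order.
    """
    inbox = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            token = inbox.get()
            if token is None:  # sentinel: no more tokens for this worker
                break
            tag, a_rows, b = token
            out = matmul_rows(a_rows, b)
            with lock:
                results[tag] = out

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for token in tokens:
        inbox.put(token)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


def dataflow_matmul(a, b, pass_id=0, block=1, num_workers=4):
    """Split A into row blocks, tag each block, and reassemble C by tag."""
    tokens = [((pass_id, i), a[i:i + block], b)
              for i in range(0, len(a), block)]
    results = run_node(tokens, num_workers)
    c = []
    for tag in sorted(results):  # tags order the blocks within a pass
        c.extend(results[tag])
    return c


if __name__ == "__main__":
    print(dataflow_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In this sketch the tag is a `(pass_id, first_row)` pair, so results from a second pass with a different `pass_id` could share the same node without being confused with the first. How DFC itself encodes and matches tags is not stated in the abstract.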