An Automatic Design Flow for Hybrid Parallel Computing on MPSoCs (Abstract Only)
Hongyuan Ding, Miaoqing Huang
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015-02-22
https://doi.org/10.1145/2684746.2689141
State-of-the-art high-level synthesis (HLS) tools lower the threshold for designers to exploit the performance benefits of hardware accelerators. However, achieving parallelism on a hybrid multiprocessor system-on-chip (MPSoC) remains a challenge. In this work, we present an automatic hybrid design flow that generates the hybrid hardware platform as well as both the hardware and software kernels. In addition, we propose a hybrid OpenCL-like programming model that combines software and hardware kernels running on the unified hardware platform. Our results show that the automatic design flow not only significantly reduces development time, but also achieves a speedup of about 11 times over a pure software parallel implementation for a matrix multiplication benchmark.
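The abstract does not describe the programming model's API, so the following is only an illustrative sketch of the general idea of dividing a matrix multiplication between software and hardware kernels. Every name here (matmul_rows, software_kernel, hardware_kernel, hybrid_matmul) is hypothetical, both kernels are simulated in plain Python, and the even row split is an assumption rather than the authors' scheduling policy.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a, b, row_start, row_end):
    """Compute rows [row_start, row_end) of the product a x b."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(row_start, row_end)
    ]


# Hypothetical stand-ins. On a real hybrid platform one of these would run on
# a processor core and the other on an HLS-generated accelerator; here both
# simply call the same Python routine.
def software_kernel(a, b, row_start, row_end):
    return matmul_rows(a, b, row_start, row_end)


def hardware_kernel(a, b, row_start, row_end):
    return matmul_rows(a, b, row_start, row_end)


def hybrid_matmul(a, b, kernels):
    """Give each kernel one contiguous block of rows and merge the results."""
    n = len(a)
    count = len(kernels)
    # Assumed policy: split rows as evenly as possible across the kernels.
    bounds = [n * i // count for i in range(count + 1)]
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [
            pool.submit(kernel, a, b, bounds[i], bounds[i + 1])
            for i, kernel in enumerate(kernels)
        ]
        result = []
        for future in futures:  # collected in submission order, so rows stay ordered
            result.extend(future.result())
    return result


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
c = hybrid_matmul(a, b, [software_kernel, hardware_kernel])
```

In this sketch the host code treats both kernel types uniformly, which is the property an OpenCL-like model is meant to provide. A real system would likely weight the split by each kernel's throughput instead of dividing rows evenly.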