{"title":"多核cpu和gpu上Simulink模型基于模型的并行化","authors":"Zhaoqian Zhong, M. Edahiro","doi":"10.1109/ISOCC47750.2019.9078489","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a model-based approach to parallelize Simulink models on multicore CPUs and NVIDIA GPUs at the block level and generate CUDA C codes for parallel execution. In our proposed approach, the Simulink models are converted to directed acyclic graphs (DAGs) based on their block diagrams, wherein the nodes represent tasks of grouped blocks in the model and the edges represent the communication behaviors between blocks. Next, a path analysis is conducted on the DAGs to extract all execution paths and calculate the length of each path, which comprises the execution times of tasks and the communication times of edges on the path. Then, an integer linear programming (ILP) formulation is used to minimize the length of the critical path of the DAG, which represents the execution time of the Simulink model. The ILP formulation also balances the workloads on each CPU core for optimized hardware utilization. We evaluate the proposed approach by parallelizing an image processing model on a platform of two homogeneous CPU cores and two GPUs to determine its effectiveness.","PeriodicalId":113802,"journal":{"name":"2019 International SoC Design Conference (ISOCC)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Model-based Parallelization for Simulink Models on Multicore CPUs and GPUs\",\"authors\":\"Zhaoqian Zhong, M. Edahiro\",\"doi\":\"10.1109/ISOCC47750.2019.9078489\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a model-based approach to parallelize Simulink models on multicore CPUs and NVIDIA GPUs at the block level and generate CUDA C codes for parallel execution. In our proposed approach, the Simulink models are converted to directed acyclic graphs (DAGs) based on their block diagrams, wherein the nodes represent tasks of grouped blocks in the model and the edges represent the communication behaviors between blocks. Next, a path analysis is conducted on the DAGs to extract all execution paths and calculate the length of each path, which comprises the execution times of tasks and the communication times of edges on the path. Then, an integer linear programming (ILP) formulation is used to minimize the length of the critical path of the DAG, which represents the execution time of the Simulink model. The ILP formulation also balances the workloads on each CPU core for optimized hardware utilization. We evaluate the proposed approach by parallelizing an image processing model on a platform of two homogeneous CPU cores and two GPUs to determine its effectiveness.\",\"PeriodicalId\":113802,\"journal\":{\"name\":\"2019 International SoC Design Conference (ISOCC)\",\"volume\":\"20 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International SoC Design Conference (ISOCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISOCC47750.2019.9078489\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International SoC Design Conference (ISOCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISOCC47750.2019.9078489","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Model-based Parallelization for Simulink Models on Multicore CPUs and GPUs
In this paper, we propose a model-based approach to parallelize Simulink models on multicore CPUs and NVIDIA GPUs at the block level and generate CUDA C codes for parallel execution. In our proposed approach, the Simulink models are converted to directed acyclic graphs (DAGs) based on their block diagrams, wherein the nodes represent tasks of grouped blocks in the model and the edges represent the communication behaviors between blocks. Next, a path analysis is conducted on the DAGs to extract all execution paths and calculate the length of each path, which comprises the execution times of tasks and the communication times of edges on the path. Then, an integer linear programming (ILP) formulation is used to minimize the length of the critical path of the DAG, which represents the execution time of the Simulink model. The ILP formulation also balances the workloads on each CPU core for optimized hardware utilization. We evaluate the proposed approach by parallelizing an image processing model on a platform of two homogeneous CPU cores and two GPUs to determine its effectiveness.