Ning Li; Dejian Li; Zhipeng Wu; Peiguang Jing; Sio Hang Pun; Yu Liu
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 7, pp. 2827-2831. Published 2024-12-18. DOI: 10.1109/TCAD.2024.3520024. Impact factor: 2.9 (JCR Q2, Computer Science, Hardware & Architecture). https://ieeexplore.ieee.org/document/10806802/
SubMap: A Partial Mapping Strategy for CGRA Based on sub-CGRA Exploration
Coarse-grained reconfigurable arrays (CGRAs) are well suited to compute-intensive loop kernels because they offer an excellent balance of performance, energy efficiency, and reconfigurability. However, the efficiency of a CGRA depends heavily on how the compiler maps the data flow graph (DFG) extracted from an application kernel onto the target architecture. Most existing CGRA compilers suffer from long compilation times because the exploration space is excessively large. To reduce the exploration space and the compilation time, we propose SubMap, which adaptively explores a suitable sub-CGRA within the target CGRA for each DFG and then performs the mapping efficiently. Experimental results show that SubMap greatly reduces compilation time compared with the latest methods while maintaining mapping quality. On a $4\times 4$ HyCube, SubMap achieves average performance improvements of $9.47\times$ and $11.67\times$ over Morpher (Pathfinder) and Morpher (SA), respectively. As the scale of the target CGRA increases, the performance improvement of SubMap becomes more pronounced.
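The abstract does not describe how SubMap selects a sub-CGRA, so the following is only an illustrative sketch of the general idea of shrinking the exploration space, not the authors' algorithm. It assumes a rectangular array of identical processing elements (PEs) and uses the standard resource-constrained lower bound on the initiation interval, ceil(operations / PEs). The function names `res_mii` and `candidate_sub_cgras` and the parameter `max_ii` are hypothetical and chosen for this example.

```python
def res_mii(num_ops: int, num_pes: int) -> int:
    """Resource-constrained lower bound on the initiation interval (II).

    With num_pes PEs, at most num_pes operations can be issued per cycle,
    so a loop body of num_ops operations needs at least ceil(num_ops / num_pes) cycles.
    """
    return (num_ops + num_pes - 1) // num_pes


def candidate_sub_cgras(rows: int, cols: int, num_ops: int, max_ii: int):
    """List candidate sub-array shapes as (sub_rows, sub_cols, ii_bound).

    Assumption for this sketch: a shape is a candidate only if its
    resource-constrained II bound does not exceed max_ii. Real mappers
    must also account for routing and memory constraints.
    """
    shapes = []
    for r in range(1, rows + 1):
        for c in range(1, cols + 1):
            bound = res_mii(num_ops, r * c)
            if bound <= max_ii:
                shapes.append((r, c, bound))
    # Try the smallest arrays first, preferring near-square shapes on ties.
    shapes.sort(key=lambda s: (s[0] * s[1], abs(s[0] - s[1])))
    return shapes


if __name__ == "__main__":
    # A 20-operation DFG on a 4x4 array, accepting an II of at most 2.
    for shape in candidate_sub_cgras(4, 4, 20, 2):
        print(shape)
```

Under these assumptions, a mapper could attempt placement and routing on the smallest candidate first and fall back to larger shapes only when mapping fails, which is one plausible way a smaller search space could translate into shorter compilation time.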
About the journal:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.