基于强化学习的NoC应用映射优化

ACM Transactions on Design Automation of Electronic Systems (TODAES) Pub Date : 2022-02-18 DOI:10.1145/3510381

S. Jagadheesh, P. V. Bhanu, S. J

{"title":"基于强化学习的NoC应用映射优化","authors":"S. Jagadheesh, P. V. Bhanu, S. J","doi":"10.1145/3510381","DOIUrl":null,"url":null,"abstract":"Application mapping is one of the early stage design processes aimed to improve the performance of Network-on-Chip. Mapping is an NP-hard problem. A massive amount of high-quality supervised data is required to solve the application mapping problem using traditional neural networks. In this article, a reinforcement learning–based neural framework is proposed to learn the heuristics of the application mapping problem. The proposed reinforcement learning–based mapping algorithm (RL-MAP) has actor and critic networks. The actor is a policy network, which provides mapping sequences. The critic network estimates the communication cost of these mapping sequences. The actor network updates the policy distribution in the direction suggested by the critic. The proposed RL-MAP is trained with unsupervised data to predict the permutations of the cores to minimize the overall communication cost. Further, the solutions are improved using the 2-opt local search algorithm. The performance of RL-MAP is compared with a few well-known heuristic algorithms, the Neural Mapping Algorithm (NMA) and message-passing neural network-pointer network-based genetic algorithm (MPN-GA). Results show that the communication cost and runtime of the RL-MAP improved considerably in comparison with the heuristic algorithms. The communication cost of the solutions generated by RL-MAP is nearly equal to MPN-GA and improved by 4.2% over NMA, while consuming less runtime.","PeriodicalId":6933,"journal":{"name":"ACM Transactions on Design Automation of Electronic Systems (TODAES)","volume":"27 1","pages":"1 - 16"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"NoC Application Mapping Optimization Using Reinforcement Learning\",\"authors\":\"S. Jagadheesh, P. V. Bhanu, S. J\",\"doi\":\"10.1145/3510381\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Application mapping is one of the early stage design processes aimed to improve the performance of Network-on-Chip. Mapping is an NP-hard problem. A massive amount of high-quality supervised data is required to solve the application mapping problem using traditional neural networks. In this article, a reinforcement learning–based neural framework is proposed to learn the heuristics of the application mapping problem. The proposed reinforcement learning–based mapping algorithm (RL-MAP) has actor and critic networks. The actor is a policy network, which provides mapping sequences. The critic network estimates the communication cost of these mapping sequences. The actor network updates the policy distribution in the direction suggested by the critic. The proposed RL-MAP is trained with unsupervised data to predict the permutations of the cores to minimize the overall communication cost. Further, the solutions are improved using the 2-opt local search algorithm. The performance of RL-MAP is compared with a few well-known heuristic algorithms, the Neural Mapping Algorithm (NMA) and message-passing neural network-pointer network-based genetic algorithm (MPN-GA). Results show that the communication cost and runtime of the RL-MAP improved considerably in comparison with the heuristic algorithms. The communication cost of the solutions generated by RL-MAP is nearly equal to MPN-GA and improved by 4.2% over NMA, while consuming less runtime.\",\"PeriodicalId\":6933,\"journal\":{\"name\":\"ACM Transactions on Design Automation of Electronic Systems (TODAES)\",\"volume\":\"27 1\",\"pages\":\"1 - 16\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Design Automation of Electronic Systems (TODAES)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3510381\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Design Automation of Electronic Systems (TODAES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510381","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

摘要

应用程序映射是旨在提高片上网络性能的早期设计过程之一。映射是np困难问题。传统的神经网络需要大量高质量的监督数据来解决应用映射问题。本文提出了一种基于强化学习的神经网络框架来学习应用映射问题的启发式算法。本文提出的基于强化学习的映射算法(RL-MAP)具有行动者网络和批评家网络。参与者是一个策略网络，它提供映射序列。批评家网络估计这些映射序列的通信成本。演员网络按照评论家建议的方向更新策略分布。提出的RL-MAP使用无监督数据进行训练，以预测核心的排列，以最小化总体通信成本。进一步，利用2-opt局部搜索算法对解进行了改进。将RL-MAP的性能与几种著名的启发式算法，神经映射算法(NMA)和基于消息传递神经网络指针网络的遗传算法(MPN-GA)进行了比较。结果表明，与启发式算法相比，RL-MAP的通信成本和运行时间有了很大的提高。RL-MAP生成的解决方案的通信成本几乎等于MPN-GA，比NMA提高4.2%，同时消耗更少的运行时间。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

NoC Application Mapping Optimization Using Reinforcement Learning

Application mapping is one of the early stage design processes aimed to improve the performance of Network-on-Chip. Mapping is an NP-hard problem. A massive amount of high-quality supervised data is required to solve the application mapping problem using traditional neural networks. In this article, a reinforcement learning–based neural framework is proposed to learn the heuristics of the application mapping problem. The proposed reinforcement learning–based mapping algorithm (RL-MAP) has actor and critic networks. The actor is a policy network, which provides mapping sequences. The critic network estimates the communication cost of these mapping sequences. The actor network updates the policy distribution in the direction suggested by the critic. The proposed RL-MAP is trained with unsupervised data to predict the permutations of the cores to minimize the overall communication cost. Further, the solutions are improved using the 2-opt local search algorithm. The performance of RL-MAP is compared with a few well-known heuristic algorithms, the Neural Mapping Algorithm (NMA) and message-passing neural network-pointer network-based genetic algorithm (MPN-GA). Results show that the communication cost and runtime of the RL-MAP improved considerably in comparison with the heuristic algorithms. The communication cost of the solutions generated by RL-MAP is nearly equal to MPN-GA and improved by 4.2% over NMA, while consuming less runtime.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Design Automation of Electronic Systems (TODAES)

自引率

0.00%

发文量