Machine Learning Approaches for Efficient Design Space Exploration of Application-Specific NoCs

ACM Transactions on Design Automation of Electronic Systems (TODAES). Pub Date: 2020-08-27. DOI: 10.1145/3403584

YongTing Hu, Marcel Mettler, Daniel Mueller-Gritschneder, Thomas Wild, A. Herkersdorf, Ulf Schlichtmann

Volume 25, pages 1-27. Publication type: Journal Article. Source: Semantic Scholar.

Citations: 3


Machine Learning Approaches for Efficient Design Space Exploration of Application-Specific NoCs

In many Multi-Processor Systems-on-Chip (MPSoCs), traffic between cores is unbalanced. This motivates the use of an application-specific Network-on-Chip (NoC) that is customized to the traffic and can provide high performance at low cost in terms of power and area. However, finding an optimized application-specific NoC architecture is challenging because the design space is huge. This article proposes applying machine learning approaches to this task. Using graph rewriting, the NoC Design Space Exploration (DSE) is modelled as a Markov Decision Process (MDP). Monte Carlo Tree Search (MCTS), a technique from reinforcement learning, is used as the search heuristic. Our experimental results show that, with the same cost function and exploration budget, MCTS finds superior NoC architectures compared to Simulated Annealing (SA) and a Genetic Algorithm (GA). However, the NoC DSE process suffers from high computation time because latency estimation relies on expensive cycle-accurate SystemC simulations. This article therefore additionally proposes replacing latency simulation with fast latency estimation using a Recurrent Neural Network (RNN). The designed RNN is sufficiently general for latency estimation on arbitrary NoC architectures. Our experiments show that, compared to SystemC simulation, the RNN-based latency estimation offers a speed-up similar to that of the widely used Queuing Theory (QT). Yet, in terms of estimation accuracy and fidelity, the RNN is superior to QT, especially for high-traffic scenarios. When SystemC simulations are replaced with the RNN estimation, the obtained solution quality decreases only slightly, whereas it suffers significantly when QT is used.
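To make the search formulation concrete, the following is a minimal sketch of MCTS over a toy NoC design space. It is not the authors' implementation: the 4-router chain topology, the traffic table, the "add one link" action (a stand-in for the graph-rewriting rules), the link budget, and the hop-count cost function are all assumptions chosen for illustration. The abstract's cost function relies on latency from SystemC simulation or an RNN estimator, neither of which is modelled here.

```python
import math
import random
from collections import deque
from itertools import combinations

# Hypothetical 4-router example; all values below are illustrative assumptions.
TRAFFIC = {(0, 3): 8.0, (1, 2): 1.0, (0, 1): 2.0}   # (src, dst) -> demand
N_NODES = 4
BASE_LINKS = frozenset({(0, 1), (1, 2), (2, 3)})     # starting chain topology
CANDIDATES = [l for l in combinations(range(N_NODES), 2) if l not in BASE_LINKS]
MAX_EXTRA = 2        # a state is terminal once this many links were added
LINK_COST = 3.0      # assumed area/power penalty per added link


def hops(links, src, dst):
    """Breadth-first hop count between two routers over undirected links."""
    adj = {n: set() for n in range(N_NODES)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return math.inf


def cost(extra):
    """Toy cost: traffic-weighted hop count plus a penalty per added link."""
    links = BASE_LINKS | extra
    comm = sum(bw * hops(links, s, d) for (s, d), bw in TRAFFIC.items())
    return comm + LINK_COST * len(extra)


def actions(state):
    """MDP actions: add one unused candidate link until the budget is reached."""
    if len(state) >= MAX_EXTRA:
        return []
    return [l for l in CANDIDATES if l not in state]


class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}                 # action -> Node
        self.visits, self.total = 0, 0.0   # visit count and summed reward

    def untried(self):
        return [a for a in actions(self.state) if a not in self.children]


def mcts(iterations=200, c=5.0, seed=0):
    rng = random.Random(seed)
    root = Node(frozenset())
    best_state, best_cost = None, math.inf
    for _ in range(iterations):
        node = root
        # 1. Selection: follow the UCT rule while the node is fully expanded.
        while not node.untried() and node.children:
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ch.total / ch.visits
                       + c * math.sqrt(math.log(parent.visits) / ch.visits))
        # 2. Expansion: add one unexplored child, if the node has any.
        untried = node.untried()
        if untried:
            action = rng.choice(untried)
            child = Node(node.state | {action}, parent=node)
            node.children[action] = child
            node = child
        # 3. Rollout: apply random actions until a terminal state is reached.
        state = node.state
        while actions(state):
            state = state | {rng.choice(actions(state))}
        final_cost = cost(state)
        if final_cost < best_cost:
            best_state, best_cost = state, final_cost
        # 4. Backpropagation: the reward is the negated cost.
        while node is not None:
            node.visits += 1
            node.total -= final_cost
            node = node.parent
    return best_state, best_cost
```

Calling `mcts()` returns the best set of added links found and its cost. The exploration constant `c` is scaled to the magnitude of this toy cost and would need retuning for any other cost function.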

