Neuron grouping and mapping methods for 2D-mesh NoC-based DNN accelerators

IF 3.4 | Region 3 (Computer Science) | JCR Q1: COMPUTER SCIENCE, THEORY & METHODS

Journal of Parallel and Distributed Computing, Volume 193, Article 104949. Pub Date: 2024-07-02. DOI: 10.1016/j.jpdc.2024.104949

Furkan Nacar, Alperen Cakin, Selma Dilek, Suleyman Tosun, Krishnendu Chakrabarty


Citations: 0

Abstract


Neuron grouping and mapping methods for 2D-mesh NoC-based DNN accelerators

Deep Neural Networks (DNNs) have gained widespread adoption in various fields; however, their computational cost is often prohibitively high due to the large number of layers and neurons communicating with each other. Furthermore, DNNs can consume a significant amount of energy due to the large volume of data movement and computation they require. To address these challenges, there is a need for new architectures to accelerate DNNs. In this paper, we propose novel neuron grouping and mapping methods for 2D-mesh Network-on-Chip (NoC)-based DNN accelerators considering both fully connected and partially connected DNN models. We present Integer Linear Programming (ILP) and simulated annealing (SA)-based neuron grouping solutions with the objective of minimizing the total volume of data communication among the neuron groups. After determining a suitable graph representation of the DNN, we also apply ILP and SA methods to map the neurons onto a 2D-mesh NoC fabric with the objective of minimizing the total communication cost of the system. We conducted several experiments on various benchmarks and DNN models with different pruning ratios and achieved an average of 40-50% improvement in communication cost.
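The SA-based mapping step described above can be illustrated with a minimal sketch. The abstract does not define the communication cost or the annealing schedule, so everything here is an assumption: the cost is taken as traffic volume multiplied by Manhattan hop distance between tiles (a common model for XY-routed 2D meshes), the move is a swap of two tile slots, and the cooling is geometric. The names `comm_cost`, `anneal_mapping`, and all parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def comm_cost(traffic, placement, cols):
    """Sum of volume * Manhattan hop distance over all communicating group pairs.

    traffic:   dict mapping (group_a, group_b) -> data volume (assumed model)
    placement: list where placement[g] is the tile index of group g
    cols:      number of columns in the mesh (tile index = row * cols + col)
    """
    total = 0
    for (a, b), volume in traffic.items():
        ra, ca = divmod(placement[a], cols)
        rb, cb = divmod(placement[b], cols)
        total += volume * (abs(ra - rb) + abs(ca - cb))
    return total


def anneal_mapping(traffic, n_groups, rows, cols,
                   t_start=10.0, t_end=0.01, alpha=0.95,
                   moves_per_temp=200, seed=0):
    """Map n_groups neuron groups onto a rows x cols mesh by simulated annealing."""
    if n_groups > rows * cols:
        raise ValueError("more groups than tiles")
    rng = random.Random(seed)

    # slots[i] for i < n_groups is the tile of group i; the rest are free tiles.
    slots = list(range(rows * cols))
    rng.shuffle(slots)

    cost = comm_cost(traffic, slots, cols)
    best, best_cost = slots[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            i = rng.randrange(n_groups)      # always move a real group
            j = rng.randrange(len(slots))    # swap with another group or a free tile
            if i == j:
                continue
            slots[i], slots[j] = slots[j], slots[i]
            new_cost = comm_cost(traffic, slots, cols)
            delta = new_cost - cost
            # Accept improvements always, worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = slots[:], cost
            else:
                slots[i], slots[j] = slots[j], slots[i]  # undo rejected move
        t *= alpha  # geometric cooling
    return best[:n_groups], best_cost


if __name__ == "__main__":
    # Hypothetical example: four groups communicating in a chain, on a 2x2 mesh.
    chain = {(0, 1): 10, (1, 2): 10, (2, 3): 10}
    placement, cost = anneal_mapping(chain, n_groups=4, rows=2, cols=2)
    print(placement, cost)
```

Recomputing the full cost after every swap keeps the sketch short; a practical implementation would likely update only the terms touching the two swapped tiles. The same loop could in principle be reused for the grouping step by swapping neurons between groups and scoring inter-group traffic, though the paper's actual formulation may differ.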


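Source journal

Journal of Parallel and Distributed Computing (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 10.30
Self-citation rate: 2.60%
Articles published: 172
Review time: 12 months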

About the journal: This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics, again covering the full range from the design to the use of our targeted systems.