Accelerating the neural network controller embedded implementation on FPGA with novel dropout techniques for a solar inverter

Impact Factor: 3.0 · CAS Tier 3, Computer Science · JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS
Jordan Sturtz, Kushal Kalyan Devalampeta Surendranath, Maxwell Sam, Xingang Fu, Chanakya Dinesh Hingu, Rajab Challoo, Letu Qingge
Journal: Pervasive and Mobile Computing, Volume 104, Article 101975
DOI: 10.1016/j.pmcj.2024.101975
Published: 2024-08-17
Full text: https://www.sciencedirect.com/science/article/pii/S1574119224001007
Citations: 0

Abstract

Accelerating neural network (NN) controllers is important for improving the performance, efficiency, scalability, and reliability of real-time systems, particularly in resource-constrained embedded systems. This paper introduces a novel weight-dropout method for training neural network controllers in real-time closed-loop systems, aimed at accelerating the embedded implementation for solar inverters. The core idea is to eliminate small-magnitude weights during training, thereby reducing the number of necessary connections while ensuring the network's convergence. To maintain convergence, only off-diagonal elements of the weight matrices were dropped. This dropout technique was integrated into the Levenberg–Marquardt and Forward Accumulation Through Time algorithms, resulting in more efficient training for trajectory tracking. We executed the proposed training algorithm with dropout on the AWS cloud, observing a performance increase of approximately four times compared to local execution. Furthermore, implementing the neural network controller on the Intel Cyclone V Field Programmable Gate Array (FPGA) demonstrates significant improvements in computational and resource efficiency, because the proposed dropout technique yields sparse weight matrices. This optimization enhances the suitability of the neural network controller for embedded environments. In comparison to Sturtz et al. (2023), which dropped 11 weights, our approach eliminated 18 weights, significantly boosting resource efficiency. This resulted in a 16.40% reduction in Adaptive Logic Modules (ALMs), decreasing the count to 47,426.5. Combinational Look-Up Tables (LUTs) and dedicated logic registers saw reductions of 17.80% and 15.55%, respectively. However, the impact on block memory bits is minimal, showing only a 1% improvement, indicating that memory resources are less affected by weight dropout. In contrast, the usage of M10K memory blocks (10-kilobit embedded memory blocks, "MK10s" in the original) dropped from 97 to 87, an approximately 10% improvement. We also propose an adaptive dropout technique to further improve the previous results.

Source journal: Pervasive and Mobile Computing (Computer Science, Information Systems; Telecommunications)
CiteScore: 7.70
Self-citation rate: 2.30%
Articles per year: 80
Review time: 68 days
Journal description: As envisioned by Mark Weiser as early as 1991, pervasive computing systems and services have truly become integral parts of our daily lives. Tremendous developments in a multitude of technologies ranging from personalized and embedded smart devices (e.g., smartphones, sensors, wearables, IoTs, etc.) to ubiquitous connectivity, via a variety of wireless mobile communications and cognitive networking infrastructures, to advanced computing techniques (including edge, fog and cloud) and user-friendly middleware services and platforms have significantly contributed to the unprecedented advances in pervasive and mobile computing. Cutting-edge applications and paradigms have evolved, such as cyber-physical systems and smart environments (e.g., smart city, smart energy, smart transportation, smart healthcare, etc.) that also involve human in the loop through social interactions and participatory and/or mobile crowd sensing, for example. The goal of pervasive computing systems is to improve human experience and quality of life, without explicit awareness of the underlying communications and computing technologies. The Pervasive and Mobile Computing Journal (PMC) is a high-impact, peer-reviewed technical journal that publishes high-quality scientific articles spanning theory and practice, and covering all aspects of pervasive and mobile computing and systems.