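Optimization of Memristor Crossbar’s Mapping Using Lagrange Multiplier Method and Genetic Algorithm for Reducing Crossbar’s Area and Delay Time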

Information, Pub Date: 2024-07-15, DOI: 10.3390/info15070409

Seungmyeong Cho, Rina Yoon, Ilpyeong Yoon, Jihwan Moon, Seokjin Oh, Kyeong-Sik Min

Citations: 0

Abstract

Memristor crossbars offer low-power, parallel processing, which makes them efficient in delay time and area for implementing convolutional neural networks (CNNs). However, mapping large CNN models such as ResNet-18, ResNet-34, and VGG-Net onto memristor crossbars is challenging because line resistance limits the crossbar size. Full-image convolution must therefore be partitioned into sub-image convolutions, and the mapping that divides the full-image convolution across multiple crossbars should be optimized. With limited crossbar resources, especially in edge devices, it is crucial to optimize the per-layer crossbar allocation to minimize hardware cost in terms of crossbar area, delay time, and area–delay-time product. This paper explores three optimization scenarios: (1) minimizing total delay time under a crossbar area constraint, (2) minimizing total crossbar area under a delay-time constraint, and (3) minimizing the area–delay-time product without constraints. The Lagrange multiplier method is used for the constrained cases 1 and 2, and a genetic algorithm (GA) is used for the unconstrained case 3. Simulation results show significant improvements over the unoptimized mapping: for VGG-Net, the optimization reduces delay time by about 20% in case 1 and area by 22% in case 2. Case 3 highlights the benefit of optimizing the crossbar utilization ratio when minimizing the area–delay-time product. The proposed optimization strategies can substantially improve the performance of neural networks on memristor crossbar-based processing-in-memory architectures, especially on resource-constrained edge computing platforms.
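To make the constrained cases 1 and 2 concrete, here is a minimal Python sketch of a Lagrange multiplier allocation under a toy cost model. The model is an assumption for illustration only and is not taken from the paper: layer i has a workload `work[i]`, receives `n[i]` crossbars, and contributes a delay of `work[i] / n[i]`, while the total area is the sum of `n[i]`. Under that model, setting the derivative of the Lagrangian to zero gives allocations proportional to the square root of each layer's workload. The function names are hypothetical, and the result is a continuous relaxation (a real mapping would need integer crossbar counts).

```python
import math


def allocate_min_delay(work, area_budget):
    """Case 1 (toy model): minimize sum(work[i] / n[i]) s.t. sum(n) == area_budget.

    Stationarity of the Lagrangian gives n[i] proportional to sqrt(work[i]).
    """
    roots = [math.sqrt(w) for w in work]
    total = sum(roots)
    return [area_budget * r / total for r in roots]


def allocate_min_area(work, delay_budget):
    """Case 2 (toy model): minimize sum(n) s.t. sum(work[i] / n[i]) == delay_budget.

    The same stationarity condition applies; only the scale factor changes.
    """
    roots = [math.sqrt(w) for w in work]
    total = sum(roots)
    return [r * total / delay_budget for r in roots]


def total_delay(work, alloc):
    """Total delay of an allocation under the assumed work/n delay model."""
    return sum(w / n for w, n in zip(work, alloc))


if __name__ == "__main__":
    # Hypothetical per-layer workloads, chosen only for illustration.
    work = [1.0, 4.0, 9.0, 16.0]
    budget = 20.0

    uniform = [budget / len(work)] * len(work)
    optimized = allocate_min_delay(work, budget)

    print("uniform delay:  ", total_delay(work, uniform))
    print("optimized delay:", total_delay(work, optimized))
    print("min-area allocation:", allocate_min_area(work, delay_budget=5.0))
```

In this toy model the two cases are duals of each other: the minimum delay for area budget A is S²/A and the minimum area for delay budget D is S²/D, where S is the sum of the square roots of the workloads. The paper's actual area and delay expressions may differ, so the closed form above should be read only as an example of the technique.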
