Efficient Coarse-Grained Reconfigurable Array architecture for machine learning applications in space using DARE65T library platform

IF 1.9 · Region 4 (Computer Science) · JCR Q3 (Computer Science, Hardware & Architecture)
Luca Zulberti, Matteo Monopoli, Pietro Nannipieri, Silvia Moranti, Geert Thys, Luca Fanucci
Microprocessors and Microsystems, vol. 113, Article 105142. Published 2025-01-14. DOI: 10.1016/j.micpro.2025.105142
https://www.sciencedirect.com/science/article/pii/S0141933125000109
Citations: 0

Abstract

Efficient Coarse-Grained Reconfigurable Array architecture for machine learning applications in space using DARE65T library platform
With the increasing use of satellites, rovers, and other space exploration devices, Artificial Intelligence (AI) is becoming an important tool for space exploration, enabling autonomous decision-making and operation in harsh environments. As a result, the space industry increasingly demands reliable and energy-efficient processing platforms. Among processing architectures, Coarse-Grained Reconfigurable Arrays (CGRAs) are becoming popular, particularly in data-intensive applications such as machine learning, where they demonstrate a substantial improvement in the energy efficiency of inference operations while preserving a good degree of versatility. In high-level class space missions, the hardware platforms incorporate radiation-hardened Field Programmable Gate Arrays (FPGAs) and microcontrollers, which do not meet the performance requirements of these AI applications. The use of CGRA architectures in space missions has not yet been widely studied. The main contribution of this work is a comprehensive Design Space Exploration (DSE) of our highly parameterized CGRA architecture, exploring the costs associated with various design parameters when targeting AI in the space domain. We evaluated performance, power consumption, and area occupation after synthesis on the radiation-hardened DARE65T standard cell library developed by imec, based on a commercial 65 nm technology process. We characterize different CGRA configurations and compare them with state-of-the-art solutions for accelerating AI algorithms. This work highlights Performance, Power, and Area (PPA) results that range from 100 MHz (up to 600 MOps), 2.43 × 10⁴ μm² cell area occupation, and 0.699 mW power consumption, to 625 MHz (up to 3.75 GOps), 2.43 × 10⁵ μm², and 46.5 mW. During the DSE activity, we highlight the optimal solutions in terms of area efficiency (up to 313.1 GOps/mm²) and energy efficiency (up to 289 GOps/W) of each CGRA configuration.
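The PPA metrics quoted in the abstract are simple ratios, and a short sketch may help when reading them. The function names below are hypothetical and the definitions are the conventional ones (throughput divided by area or by power); the abstract does not spell out how the authors computed their figures. The quoted extremes are "up to" values that need not belong to the same configuration, so the sketch only uses the paper's numbers for the operations-per-cycle check and uses made-up values for the efficiency helpers.

```python
def ops_per_cycle(throughput_ops_s: float, clock_hz: float) -> float:
    """Peak operations completed per clock cycle."""
    return throughput_ops_s / clock_hz


def area_efficiency_gops_per_mm2(throughput_ops_s: float, area_um2: float) -> float:
    """Throughput per unit cell area, in GOps/mm^2 (1 mm^2 = 1e6 um^2)."""
    area_mm2 = area_um2 / 1e6
    return (throughput_ops_s / 1e9) / area_mm2


def energy_efficiency_gops_per_w(throughput_ops_s: float, power_mw: float) -> float:
    """Throughput per unit power, in GOps/W."""
    power_w = power_mw / 1e3
    return (throughput_ops_s / 1e9) / power_w


# Figures taken from the abstract: peak throughput and clock at both ends of the range.
low_end = ops_per_cycle(600e6, 100e6)     # 600 MOps at 100 MHz
high_end = ops_per_cycle(3.75e9, 625e6)   # 3.75 GOps at 625 MHz

# Hypothetical configuration (NOT from the paper), used only to show the unit handling.
demo_area_eff = area_efficiency_gops_per_mm2(1e9, 1e5)    # 1 GOps on 0.1 mm^2
demo_energy_eff = energy_efficiency_gops_per_w(1e9, 10.0)  # 1 GOps at 10 mW

print(low_end, high_end, demo_area_eff, demo_energy_eff)
```

Both ends of the quoted range work out to the same peak operations per cycle, which suggests that the throughput span in the abstract comes from clock frequency rather than from a change in parallelism; this is an inference from the quoted numbers, not a statement made in the abstract.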
Source journal
Microprocessors and Microsystems (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 6.90
Self-citation rate: 3.80%
Articles published: 204
Review time: 172 days
About the journal: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms, ranging from custom hardware via reconfigurable systems and application-specific processors to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC), and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows, and tools for their design, as well as novel designs of hardware components, fall within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with emphasis on hardware, are also within the journal's scope.