Chiplet Set For Artificial Intelligence

Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon
{"title":"人工智能芯片套装","authors":"Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon","doi":"10.1109/3DIC57175.2023.10154953","DOIUrl":null,"url":null,"abstract":"The design reuse strategy has significantly shortened the time required to create complex System on Chips (SoCs). However, when introducing new intellectual properties (IPs), the monolithic SoC methodology requires a re-run of system-level validation steps, incurring significant costs. Partitioning the design into chiplets over an interposer would mitigate these issues by consigning the IP updates to the individual chiplet. This paper presents a chipletized design used for Artificial Intelligence (AI). This design details a scalable AI chiplet set, along with Central Processing Units (CPUs). The AI chiplet set includes an Long Short Term Memory (LSTM) Application Specific Instruction Set Processor (ASIP) for accelerating inference and training and an Sparse Convolution Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control AI accelerators and handle general tasks. The accelerators and CPUs have an AXI crossbar Network on Chip (NoC) for memory and one for controlling the accelerators. This project has two phases: phase one, IP validation with an emulated interposer (No interposer, connect chiplets through back end of line (BEOL) metal layers), and phase two, connecting validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. (USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 - 400 MHz.","PeriodicalId":245299,"journal":{"name":"2023 IEEE International 3D Systems Integration Conference (3DIC)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Chiplet Set For Artificial Intelligence\",\"authors\":\"Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon\",\"doi\":\"10.1109/3DIC57175.2023.10154953\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The design reuse strategy has significantly shortened the time required to create complex System on Chips (SoCs). However, when introducing new intellectual properties (IPs), the monolithic SoC methodology requires a re-run of system-level validation steps, incurring significant costs. Partitioning the design into chiplets over an interposer would mitigate these issues by consigning the IP updates to the individual chiplet. This paper presents a chipletized design used for Artificial Intelligence (AI). This design details a scalable AI chiplet set, along with Central Processing Units (CPUs). The AI chiplet set includes an Long Short Term Memory (LSTM) Application Specific Instruction Set Processor (ASIP) for accelerating inference and training and an Sparse Convolution Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control AI accelerators and handle general tasks. The accelerators and CPUs have an AXI crossbar Network on Chip (NoC) for memory and one for controlling the accelerators. This project has two phases: phase one, IP validation with an emulated interposer (No interposer, connect chiplets through back end of line (BEOL) metal layers), and phase two, connecting validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. 
(USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 - 400 MHz.\",\"PeriodicalId\":245299,\"journal\":{\"name\":\"2023 IEEE International 3D Systems Integration Conference (3DIC)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International 3D Systems Integration Conference (3DIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/3DIC57175.2023.10154953\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International 3D Systems Integration Conference (3DIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/3DIC57175.2023.10154953","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0

Abstract

The design reuse strategy has significantly shortened the time required to create complex Systems on Chip (SoCs). However, when new intellectual property (IP) blocks are introduced, the monolithic SoC methodology requires system-level validation steps to be re-run, incurring significant cost. Partitioning the design into chiplets over an interposer mitigates these issues by confining IP updates to the individual chiplet. This paper presents a chipletized design for Artificial Intelligence (AI). The design details a scalable AI chiplet set along with Central Processing Units (CPUs). The AI chiplet set includes a Long Short-Term Memory (LSTM) Application-Specific Instruction-set Processor (ASIP) for accelerating inference and training, and a Sparse Convolutional Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control the AI accelerators and handle general tasks. The accelerators and CPUs share one AXI crossbar Network on Chip (NoC) for memory and another for controlling the accelerators. The project has two phases: phase one, IP validation with an emulated interposer (no interposer; chiplets are connected through back-end-of-line (BEOL) metal layers), and phase two, connecting the validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. (USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 to 400 MHz.
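
To make the zero-skipping idea concrete, the sketch below shows in software how a sparse-convolution accelerator avoids multiply-accumulate work for zero-valued activations. It is a minimal illustration only, assuming a single-channel feature map and a dense kernel; the function name, data layout, and NumPy implementation are illustrative and do not reflect the SCNN ASIP's actual microarchitecture.

```python
import numpy as np

def sparse_conv2d_zero_skip(activations, kernel):
    """Valid-mode 2D cross-correlation that skips zero-valued activations."""
    H, W = activations.shape
    K, _ = kernel.shape
    out = np.zeros((H - K + 1, W - K + 1))

    # Iterate only over nonzero activations; each contributes to up to K*K
    # output positions. Zero activations trigger no multiply-accumulates,
    # which is where a zero-skipping accelerator saves work.
    for y, x in zip(*np.nonzero(activations)):
        a = activations[y, x]
        for ky in range(K):
            for kx in range(K):
                oy, ox = y - ky, x - kx
                if 0 <= oy < out.shape[0] and 0 <= ox < out.shape[1]:
                    out[oy, ox] += a * kernel[ky, kx]
    return out

# Small usage example: a mostly-zero 5x5 feature map and a 3x3 kernel.
act = np.zeros((5, 5))
act[1, 2] = 2.0
act[3, 3] = -1.0
ker = np.ones((3, 3))
print(sparse_conv2d_zero_skip(act, ker))
```

With two nonzero activations out of 25, this loop performs at most 2 x 9 multiply-accumulates instead of 9 x 9 for the dense case, which is the kind of saving the zero-skipping technique targets.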