Chiplet Set For Artificial Intelligence

Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon
{"title":"人工智能芯片套装","authors":"Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon","doi":"10.1109/3DIC57175.2023.10154953","DOIUrl":null,"url":null,"abstract":"The design reuse strategy has significantly shortened the time required to create complex System on Chips (SoCs). However, when introducing new intellectual properties (IPs), the monolithic SoC methodology requires a re-run of system-level validation steps, incurring significant costs. Partitioning the design into chiplets over an interposer would mitigate these issues by consigning the IP updates to the individual chiplet. This paper presents a chipletized design used for Artificial Intelligence (AI). This design details a scalable AI chiplet set, along with Central Processing Units (CPUs). The AI chiplet set includes an Long Short Term Memory (LSTM) Application Specific Instruction Set Processor (ASIP) for accelerating inference and training and an Sparse Convolution Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control AI accelerators and handle general tasks. The accelerators and CPUs have an AXI crossbar Network on Chip (NoC) for memory and one for controlling the accelerators. This project has two phases: phase one, IP validation with an emulated interposer (No interposer, connect chiplets through back end of line (BEOL) metal layers), and phase two, connecting validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. (USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 - 400 MHz.","PeriodicalId":245299,"journal":{"name":"2023 IEEE International 3D Systems Integration Conference (3DIC)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Chiplet Set For Artificial Intelligence\",\"authors\":\"Joshua A. Stevens, Tse-Han Pan, P. P. Ravichandiran, P. Franzon\",\"doi\":\"10.1109/3DIC57175.2023.10154953\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The design reuse strategy has significantly shortened the time required to create complex System on Chips (SoCs). However, when introducing new intellectual properties (IPs), the monolithic SoC methodology requires a re-run of system-level validation steps, incurring significant costs. Partitioning the design into chiplets over an interposer would mitigate these issues by consigning the IP updates to the individual chiplet. This paper presents a chipletized design used for Artificial Intelligence (AI). This design details a scalable AI chiplet set, along with Central Processing Units (CPUs). The AI chiplet set includes an Long Short Term Memory (LSTM) Application Specific Instruction Set Processor (ASIP) for accelerating inference and training and an Sparse Convolution Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control AI accelerators and handle general tasks. The accelerators and CPUs have an AXI crossbar Network on Chip (NoC) for memory and one for controlling the accelerators. This project has two phases: phase one, IP validation with an emulated interposer (No interposer, connect chiplets through back end of line (BEOL) metal layers), and phase two, connecting validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. 
(USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 - 400 MHz.\",\"PeriodicalId\":245299,\"journal\":{\"name\":\"2023 IEEE International 3D Systems Integration Conference (3DIC)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International 3D Systems Integration Conference (3DIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/3DIC57175.2023.10154953\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International 3D Systems Integration Conference (3DIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/3DIC57175.2023.10154953","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0

Abstract

The design reuse strategy has significantly shortened the time required to create complex Systems on Chip (SoCs). However, when new intellectual property (IP) blocks are introduced, the monolithic SoC methodology requires system-level validation steps to be re-run, incurring significant cost. Partitioning the design into chiplets over an interposer mitigates these issues by confining IP updates to the individual chiplet. This paper presents a chipletized design for Artificial Intelligence (AI). The design details a scalable AI chiplet set along with Central Processing Units (CPUs). The AI chiplet set includes a Long Short-Term Memory (LSTM) Application-Specific Instruction-set Processor (ASIP) for accelerating inference and training, and a Sparse Convolutional Neural Network (SCNN) ASIP for accelerating inference through a zero-skipping technique. The CPUs control the AI accelerators and handle general tasks. The accelerators and CPUs share one AXI crossbar Network on Chip (NoC) for memory and another for controlling the accelerators. The project has two phases: phase one, IP validation with an emulated interposer (no interposer; chiplets are connected through back-end-of-line (BEOL) metal layers), and phase two, connecting the validated IP through an interposer. This paper focuses on phase one, which uses the United Semiconductor Japan Co. (USJC) 55 nm LP process to fabricate the design. The chiplets' clock frequencies range from 200 to 400 MHz.
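
To make the zero-skipping idea concrete, the sketch below shows in software how a sparse-convolution accelerator avoids multiply-accumulate work for zero-valued activations. It is a minimal illustration only, assuming a single-channel feature map and a dense kernel; the function name, data layout, and NumPy implementation are illustrative and do not reflect the SCNN ASIP's actual microarchitecture.

```python
import numpy as np

def sparse_conv2d_zero_skip(activations, kernel):
    """Valid-mode 2D cross-correlation that skips zero-valued activations."""
    H, W = activations.shape
    K, _ = kernel.shape
    out = np.zeros((H - K + 1, W - K + 1))

    # Iterate only over nonzero activations; each contributes to up to K*K
    # output positions. Zero activations trigger no multiply-accumulates,
    # which is where a zero-skipping accelerator saves work.
    for y, x in zip(*np.nonzero(activations)):
        a = activations[y, x]
        for ky in range(K):
            for kx in range(K):
                oy, ox = y - ky, x - kx
                if 0 <= oy < out.shape[0] and 0 <= ox < out.shape[1]:
                    out[oy, ox] += a * kernel[ky, kx]
    return out

# Small usage example: a mostly-zero 5x5 feature map and a 3x3 kernel.
act = np.zeros((5, 5))
act[1, 2] = 2.0
act[3, 3] = -1.0
ker = np.ones((3, 3))
print(sparse_conv2d_zero_skip(act, ker))
```

With two nonzero activations out of 25, this loop performs at most 2 x 9 multiply-accumulates instead of 9 x 9 for the dense case, which is the kind of saving the zero-skipping technique targets.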