Special issue on next-gen AI and quantum technology

IF 1.3 · CAS Tier 4 (Computer Science) · Q3 (Engineering, Electrical & Electronic)
ETRI Journal, vol. 46, no. 5, pp. 743–747. Published: 2024-10-28. DOI: 10.4218/etr2.12735
Ji-Hoon Kim, Ho-Young Cha, Daewoong Kwon, Gyu Sang Choi, HeeSeok Kim, Yousung Kang
{"title":"Special issue on next-gen AI and quantum technology","authors":"Ji-Hoon Kim,&nbsp;Ho-Young Cha,&nbsp;Daewoong Kwon,&nbsp;Gyu Sang Choi,&nbsp;HeeSeok Kim,&nbsp;Yousung Kang","doi":"10.4218/etr2.12735","DOIUrl":null,"url":null,"abstract":"<p>Artificial intelligence (AI) and quantum technology are two key fields that drive the development of modern science and technology, and their developments have had tremendous impacts on academia and industry. AI is a technology that can solve complex problems through data-based learning and inference and is already driving innovation in various industries such as healthcare, finance, and manufacturing. In particular, the development of AI has enabled practical applications in autonomous driving, natural language processing, and image recognition, greatly improving the quality of human life.</p><p>Quantum technology utilizes the principles of quantum mechanics to provide new computational capabilities beyond the scope of classical computing methods. Quantum computing has the potential to perform multiple calculations simultaneously using quantum bits (qubits), which is expected to lead to innovative results in complex optimization problems and the analysis of large datasets. Additionally, quantum technology plays an important role in secure communication, with technologies such as quantum key distribution (QKD) providing security surpassing that of existing encryption methods.</p><p>The Electronics and Telecommunications Research Institute (ETRI) Journal is a peer-reviewed open-access journal that launched in 1993 and is published bimonthly by ETRI of the Republic of Korea, aiming to promote worldwide academic exchange in information, telecommunications, and electronics. This special issue of the ETRI Journal focuses on exploring the latest research on these cutting-edge technologies and highlighting the challenges and opportunities that each technology presents. The research included in this special issue clearly demonstrates the significant impact that each of the advancements in both AI and quantum technologies have on academia and industry. AI is already driving change in many fields and is focused on creating more efficient and intelligent systems. In contrast, quantum technologies are introducing a novel computing paradigm, revealing groundbreaking possibilities for computational power and secure communication.</p><p>The papers selected for this special issue cover various aspects of AI and quantum technologies. In the AI field, the latest hardware architectures, energy-efficient AI systems such as spiking neural networks (SNNs), and AI application technologies such as anomaly detection are introduced. In the field of quantum technology, theoretical developments in quantum computing, quantum photonic systems, and secure communication technologies such as QKD are discussed.</p><p>The first paper [<span>1</span>], titled “Trends in quantum reinforcement learning: State-of-the-arts and the road ahead by Park and Kim,” is an invited paper. This paper presents the foundational quantum reinforcement learning theory and explores quantum-neural-network-based reinforcement learning models with advantages such as fast training and scalability. It also discusses multi-agent applications, including quantum-centralized critics and multiple-actor networks. 
Future research directions include federated learning, autonomous control, and quantum deep learning software testing.</p><p>In the second paper [<span>2</span>], titled “Optimal execution of logical hadamard with low-space overhead in rotated surface code,” Lee et al. propose a novel method for executing the logical Hadamard operation in rotated surface codes with minimal space requirements. Using boundary deformation, this method rotates the logical qubit affected by the transversal Hadamard and restores its original encoding through logical flip-and-shift operations. The space–time cost for this approach is 5<i>d</i><sup>3</sup> + 3<i>d</i><sup>2</sup>, offering an efficiency that is approximately four times greater than that of previous methods. It requires only two patches for implementation, in contrast to the traditional seven patches, and maintains parallelism in quantum circuits by avoiding interference between adjacent logical qubits.</p><p>The third paper [<span>3</span>], titled “Quantum electrodynamical formulation of photochemical acid generation and its implications on optical lithography” by Lee, refines photochemical acid generation using quantum electrodynamics principles, providing a probabilistic description of acid generation and deprotection density in photoresists. It combines quantum mechanical acid generation with deprotection mechanisms to analyze stochastic feature formation, thereby offering key insights into the randomness of the deprotection process.</p><p>The fourth paper [<span>4</span>], titled “Fabrication of low-loss symmetrical rib waveguides based on 𝑥-cut lithium niobate on insulator for integrated quantum photonics” by Kim et al., presents the fabrication of a low-loss lithium niobate on insulator (LNOI) rib waveguide with an optical propagation loss of 0.16 dB/cm, which was achieved by optimizing etching parameters. The shallow etching process improves waveguide symmetry and smoothness on 𝑥-cut LNOI, supporting advancements in on-chip quantum photonic devices.</p><p>The fifth paper [<span>5</span>], titled “Metaheuristic optimization scheme for quantum kernel classifiers using entanglement-directed graphs” by Tjandra and Sugiarto, presents a novel meta-heuristic approach using a genetic algorithm to optimize quantum kernel classifiers by incorporating entanglement-directed graphs. This method effectively enhances classification performance by designing quantum circuits that leverage entanglement, outperforming classical and other quantum baselines across various datasets. The results demonstrate that the proposed approach successfully identifies optimal entanglement structures for specific datasets, leading to significant improvements in classification accuracy and F1 scores.</p><p>In the sixth paper [<span>6</span>], titled “Free-space quantum key distribution transmitter system using wavelength-division multiplexing (WDM) filter for channel integration,” Kim et al. introduce a free-space QKD transmitter system using the BB84 protocol that eliminates the need for internal alignment. It uses a custom WDM filter and polarization-encoding module to integrate the quantum and synchronization channels. This integration avoids the complex alignment processes required for conventional systems with bulk optics. The WDM filter efficiently multiplexes 785 and 1550 nm signals, with insertion losses of 1.8 and 0.7 dB, respectively. 
The system achieved a sifted key rate of 1.6 Mbps and qubit error rate of 0.62% at 100 MHz, exhibiting performance comparable to that of traditional bulk-optic devices.</p><p>The seventh paper [<span>7</span>], titled “PF-GEMV: Utilization maximizing architecture in fast matrix-vector multiplication for GPT-2 inference” by Kim et al., presents solutions for overcoming the challenges of processing matrix–vector multiplications (GEMVs). It examines the challenges faced by AI processors in light of the rapid advancement of transformer-based artificial neural networks, particularly the need to perform matrix–vector multiplication efficiently alongside traditional matrix–matrix multiplication (GEMM). The authors noted that existing AI processor architectures are primarily optimized for GEMMs, leading to considerable throughput degradation when handling GEMV.</p><p>To address this issue, their paper introduces a port-folding GEMV scheme that incorporates multi-format and low-precision techniques while leveraging an outer-product-based processor designed for conventional GEMM operations. This innovative approach achieved an impressive 93.7% utilization on GEMV tasks with an eight-bit format on an 8 × 8 processor, resulting in a 7.5 × throughput increase compared with the original design. Additionally, when applied to the matrix operations of the GPT-2 large model, the proposed scheme demonstrated a remarkable 7 × speedup on single-batch inferences. The eighth paper [<span>8</span>], titled “SNN eXpress: Streamlining low-power AI-SoC development with unsigned weight accumulation spiking neural network” by Jang et al., presents solutions for overcoming the challenges of developing low-power AI-SoCs using analog-circuit-based unsigned weight accumulating spiking neural networks (UWA-SNNs). It introduces the SNN eXpress tool, which automates the design process, enabling the rapid development and verification of UWA-SNN-based AI-SoCs, as demonstrated by the creation of two AI-SoCs.</p><p>In the ninth paper [<span>9</span>], titled “XEM: Tensor accelerator for AB21 supercomputing artificial intelligence processor,” Jeon et al. introduce the XEM accelerator, which is designed to enhance the AB21 supercomputing AI processor's efficiency when performing the tensor-based linear-algebraic operations that are crucial for hyperscale AI and high-performance computing applications. The XEM architecture, with its outer-product-based parallel floating-point units, is detailed along with new instructions and hardware characteristics, with future verification planned alongside the AB21 processor chip.</p><p>The tenth paper [<span>10</span>], titled “NEST-C: A deep learning compiler framework for heterogeneous computing systems with AI accelerators” by Park et al., introduces NEST-C, a deep learning compiler framework designed to optimize the deployment and performance of deep learning models across various AI accelerators. This framework achieves significant computational efficiency by incorporating profiling-based quantization, dynamic graph partitioning, and multilevel intermediate representation integration. The experimental results demonstrate that NEST-C enhances throughput and reduces latency across different hardware platforms, making it a versatile tool for modern AI applications.</p><p>The eleventh paper [<span>11</span>], titled “Mixed-mode SNN crossbar array with embedded dummy switch and mid-node pre-charge scheme” by Park et al., presents solutions for overcoming the challenges of processing GEMVs. 
This paper introduces a membrane computation error-minimized mixed-mode SNN crossbar array. The authors implemented an embedded dummy switch scheme along with a mid-node pre-charge technique to create a high-precision current-mode synapse. This innovative approach effectively mitigates charge sharing between membrane capacitors and the parasitic capacitance of synapses, thereby reducing computational errors. A prototype chip featuring a 400 × 20 SNN crossbar was fabricated using a 28 nm fully depleted silicon on insulator (FDSOI) complementary metal–oxide–semiconductor (CMOS) process, successfully recognizing 20 MNIST patterns reduced to 20 × 20 pixels with a power consumption of 411 μW. Notably, the peak-to-peak deviation of the normalized output spike count from the 21 fabricated SNN prototype chips remained within 16.5% of the ideal value, accounting for random sample-wise variations.</p><p>The twelfth paper [<span>12</span>], titled “Asynchronous interface circuit for nonlinear connectivity in multi-core spiking neural networks” by Oh et al., presents solutions for overcoming the challenges of processing GEMVs. This paper addresses the need for an interface circuit that supports multiple SNN cores to facilitate SNN expansion. The proposed circuit employs an asynchronous design approach to mimic the operational characteristics of the human brain; however, the lack of a global clock introduces timing challenges during implementation. To address this issue, the authors proposed an intermediate latching template that establishes asynchronous nonlinear connectivity and enables multi-pipeline processing utilizing multiple SNN cores.</p><p>This design incorporates arbitration and distribution blocks based on the proposed template and is fabricated as a fully custom interface circuit supporting four SNN cores in a 28 nm CMOS FDSOI process. The results indicate that this innovative template can enhance the throughput of the interface circuit by up to 53% compared with conventional asynchronous designs. Additionally, the interface circuit can transmit a spike with an energy consumption of 1.7 pJ at a supply voltage of 0.9 V, supporting 606 Mevent/s for intra-chip communication, and 3.7 pJ at the same voltage for 59 Mevent/s inter-chip communication.</p><p>The thirteenth paper [<span>13</span>], titled “AONet: Attention network with optional activation for unsupervised video anomaly detection” by Rakhmonov et al., proposes AONet, a novel attention-based neural network designed for unsupervised video anomaly detection, which incorporates a unique activation function (OptAF) combining the benefits of the rectified linear unit (ReLU), leaky ReLU, and sigmoid functions. This method efficiently captures spatiotemporal features using a temporal shift module and residual autoencoder, achieving superior performance on benchmark datasets compared with state-of-the-art methods. The model was evaluated on three datasets, demonstrating competitive accuracy and speed with an area under the curve (AUC) score of 97.0% on the UCSD Ped2 dataset.</p><p>As guest editors, we are pleased to explore the future of these cutting-edge technologies in this special issue. We express our deepest gratitude to the authors for their outstanding contributions, reviewers for their thorough review, and editorial team for their assistance in publishing this special issue. 
We hope that this special issue will contribute to a broader understanding of the developments in AI and quantum technology and foster further research and innovation.</p><p>The authors declare that there are no conflicts of interest.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 5","pages":"743-747"},"PeriodicalIF":1.3000,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etr2.12735","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ETRI Journal","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.4218/etr2.12735","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Artificial intelligence (AI) and quantum technology are two key fields that drive the development of modern science and technology, and their developments have had tremendous impacts on academia and industry. AI is a technology that can solve complex problems through data-based learning and inference and is already driving innovation in various industries such as healthcare, finance, and manufacturing. In particular, the development of AI has enabled practical applications in autonomous driving, natural language processing, and image recognition, greatly improving the quality of human life.

Quantum technology utilizes the principles of quantum mechanics to provide new computational capabilities beyond the scope of classical computing methods. Quantum computing has the potential to perform multiple calculations simultaneously using quantum bits (qubits), which is expected to lead to innovative results in complex optimization problems and the analysis of large datasets. Additionally, quantum technology plays an important role in secure communication, with technologies such as quantum key distribution (QKD) providing security surpassing that of existing encryption methods.

The Electronics and Telecommunications Research Institute (ETRI) Journal is a peer-reviewed open-access journal that was launched in 1993 and is published bimonthly by ETRI of the Republic of Korea, aiming to promote worldwide academic exchange in information, telecommunications, and electronics. This special issue of the ETRI Journal focuses on exploring the latest research on these cutting-edge technologies and highlighting the challenges and opportunities that each technology presents. The research included in this special issue clearly demonstrates the significant impact that advancements in both AI and quantum technology have on academia and industry. AI is already driving change in many fields and is focused on creating more efficient and intelligent systems. In contrast, quantum technologies are introducing a novel computing paradigm, revealing groundbreaking possibilities for computational power and secure communication.

The papers selected for this special issue cover various aspects of AI and quantum technologies. In the AI field, the latest hardware architectures, energy-efficient AI systems such as spiking neural networks (SNNs), and AI application technologies such as anomaly detection are introduced. In the field of quantum technology, theoretical developments in quantum computing, quantum photonic systems, and secure communication technologies such as QKD are discussed.

The first paper [1], titled “Trends in quantum reinforcement learning: State-of-the-arts and the road ahead” by Park and Kim, is an invited paper. This paper presents the foundational quantum reinforcement learning theory and explores quantum-neural-network-based reinforcement learning models with advantages such as fast training and scalability. It also discusses multi-agent applications, including quantum-centralized critics and multiple-actor networks. Future research directions include federated learning, autonomous control, and quantum deep learning software testing.

In the second paper [2], titled “Optimal execution of logical Hadamard with low-space overhead in rotated surface code,” Lee et al. propose a novel method for executing the logical Hadamard operation in rotated surface codes with minimal space requirements. Using boundary deformation, this method rotates the logical qubit affected by the transversal Hadamard and restores its original encoding through logical flip-and-shift operations. The space–time cost of this approach is 5d³ + 3d², offering an efficiency that is approximately four times greater than that of previous methods. It requires only two patches for implementation, in contrast to the traditional seven patches, and maintains parallelism in quantum circuits by avoiding interference between adjacent logical qubits.
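To put the quoted cost formula in perspective, here is a minimal Python sketch; the code distances used and the baseline derived from the stated roughly fourfold gain are illustrative assumptions, not figures from the paper.

    # Back-of-envelope comparison of the space-time cost 5*d**3 + 3*d**2
    # reported for the boundary-deformation Hadamard, versus a baseline
    # assumed to be ~4x larger (the paper's stated efficiency gain).
    def hadamard_cost(d: int) -> int:
        """Space-time cost of the proposed logical Hadamard at code distance d."""
        return 5 * d**3 + 3 * d**2

    for d in (5, 7, 11):                       # illustrative code distances
        proposed = hadamard_cost(d)
        baseline = 4 * proposed                # hypothetical prior cost implied by the ~4x gain
        print(f"d={d:2d}: proposed={proposed:6d}, assumed baseline~{baseline:6d}")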

The third paper [3], titled “Quantum electrodynamical formulation of photochemical acid generation and its implications on optical lithography” by Lee, refines photochemical acid generation using quantum electrodynamics principles, providing a probabilistic description of acid generation and deprotection density in photoresists. It combines quantum mechanical acid generation with deprotection mechanisms to analyze stochastic feature formation, thereby offering key insights into the randomness of the deprotection process.
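As a rough illustration of why acid generation is stochastic at all, the toy model below treats the acid count per resist voxel as Poisson-distributed; the mean value is hypothetical, and this is far simpler than the paper's quantum electrodynamical treatment.

    import numpy as np

    # Toy shot-noise model: the number of acids generated in each resist voxel
    # is drawn from a Poisson distribution whose mean scales with local dose.
    # Only an illustrative simplification of the stochastic behavior the paper
    # derives from quantum electrodynamics.
    rng = np.random.default_rng(0)
    mean_acids_per_voxel = 12.0                  # assumed average acid count at nominal dose
    voxels = rng.poisson(mean_acids_per_voxel, size=100_000)

    relative_noise = voxels.std() / voxels.mean()   # ~1/sqrt(mean) for Poisson
    print(f"mean={voxels.mean():.2f}, std={voxels.std():.2f}, "
          f"relative fluctuation={relative_noise:.3f}")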

The fourth paper [4], titled “Fabrication of low-loss symmetrical rib waveguides based on x-cut lithium niobate on insulator for integrated quantum photonics” by Kim et al., presents the fabrication of a low-loss lithium niobate on insulator (LNOI) rib waveguide with an optical propagation loss of 0.16 dB/cm, which was achieved by optimizing etching parameters. The shallow etching process improves waveguide symmetry and smoothness on x-cut LNOI, supporting advancements in on-chip quantum photonic devices.
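For context, the reported 0.16 dB/cm loss can be converted into fractional transmission over a chip-scale length; the 2 cm length below is a hypothetical example, not a device from the paper.

    # Convert a propagation loss in dB/cm into fractional transmission
    # over an assumed (hypothetical) waveguide length.
    loss_db_per_cm = 0.16
    length_cm = 2.0                              # illustrative chip-scale length
    total_loss_db = loss_db_per_cm * length_cm
    transmission = 10 ** (-total_loss_db / 10)
    print(f"{total_loss_db:.2f} dB over {length_cm} cm -> {transmission:.1%} transmitted")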

The fifth paper [5], titled “Metaheuristic optimization scheme for quantum kernel classifiers using entanglement-directed graphs” by Tjandra and Sugiarto, presents a novel meta-heuristic approach using a genetic algorithm to optimize quantum kernel classifiers by incorporating entanglement-directed graphs. This method effectively enhances classification performance by designing quantum circuits that leverage entanglement, outperforming classical and other quantum baselines across various datasets. The results demonstrate that the proposed approach successfully identifies optimal entanglement structures for specific datasets, leading to significant improvements in classification accuracy and F1 scores.
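The general shape of such a search can be sketched as a genetic algorithm over entanglement graphs encoded as adjacency matrices; the encoding, the placeholder fitness function, and all hyperparameters below are illustrative assumptions rather than the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(1)
    N_QUBITS = 4

    def random_graph():
        # Random upper-triangular adjacency matrix: entry (i, j) = 1 means
        # qubits i and j receive an entangling gate in the kernel circuit.
        g = rng.integers(0, 2, size=(N_QUBITS, N_QUBITS))
        return np.triu(g, k=1)

    def fitness(graph):
        # Stand-in for the cross-validated score of a quantum kernel classifier
        # built from `graph`; in the paper this requires simulating the circuit.
        return float(graph.sum()) + rng.normal(scale=0.1)

    def crossover(a, b):
        mask = rng.integers(0, 2, size=a.shape).astype(bool)
        return np.triu(np.where(mask, a, b), k=1)

    def mutate(g, rate=0.1):
        flips = rng.random(g.shape) < rate
        return np.triu(np.logical_xor(g, flips).astype(int), k=1)

    population = [random_graph() for _ in range(20)]
    for _ in range(10):                                  # generations
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:10]                            # truncation selection
        children = [mutate(crossover(parents[rng.integers(10)],
                                     parents[rng.integers(10)]))
                    for _ in range(10)]
        population = parents + children

    best = max(population, key=fitness)
    print("best entanglement graph (upper-triangular adjacency matrix):")
    print(best)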

In the sixth paper [6], titled “Free-space quantum key distribution transmitter system using wavelength-division multiplexing (WDM) filter for channel integration,” Kim et al. introduce a free-space QKD transmitter system using the BB84 protocol that eliminates the need for internal alignment. It uses a custom WDM filter and polarization-encoding module to integrate the quantum and synchronization channels. This integration avoids the complex alignment processes required for conventional systems with bulk optics. The WDM filter efficiently multiplexes 785 and 1550 nm signals, with insertion losses of 1.8 and 0.7 dB, respectively. The system achieved a sifted key rate of 1.6 Mbps and qubit error rate of 0.62% at 100 MHz, exhibiting performance comparable to that of traditional bulk-optic devices.
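The sifting step of BB84 that yields the sifted key can be illustrated with a short simulation; the snippet below assumes an ideal, lossless channel and therefore only shows the roughly 50% basis-matching ratio, not the full link budget behind the 1.6 Mbps figure.

    import numpy as np

    # Toy BB84 sifting: Alice and Bob each pick a random basis (0 = rectilinear,
    # 1 = diagonal) per pulse; only pulses with matching bases contribute to the
    # sifted key. Channel loss and detector efficiency are ignored here.
    rng = np.random.default_rng(42)
    n_pulses = 1_000_000

    alice_bits  = rng.integers(0, 2, n_pulses)
    alice_basis = rng.integers(0, 2, n_pulses)
    bob_basis   = rng.integers(0, 2, n_pulses)

    matched = alice_basis == bob_basis
    sifted_key = alice_bits[matched]            # ideal channel: Bob reads the same bits
    print(f"sifting ratio: {matched.mean():.3f} (expected ~0.5)")
    print(f"sifted bits from {n_pulses} pulses: {sifted_key.size}")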

The seventh paper [7], titled “PF-GEMV: Utilization maximizing architecture in fast matrix-vector multiplication for GPT-2 inference” by Kim et al., presents solutions for overcoming the challenges of processing matrix–vector multiplications (GEMVs). It examines the challenges faced by AI processors in light of the rapid advancement of transformer-based artificial neural networks, particularly the need to perform matrix–vector multiplication efficiently alongside traditional matrix–matrix multiplication (GEMM). The authors note that existing AI processor architectures are primarily optimized for GEMMs, leading to considerable throughput degradation when handling GEMV operations.
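To make this degradation concrete, a back-of-envelope model of an 8 × 8 outer-product MAC array is sketched below; the assumed dataflow (a full 8 × 8 outer product per cycle for GEMM, a single active column for a naive GEMV mapping) is illustrative and not the processor's exact design.

    # Back-of-envelope utilization model for an 8 x 8 outer-product MAC array.
    # Assumed (illustrative) dataflow: for GEMM the array multiplies an
    # 8-element matrix column by an 8-element activation row each cycle, so all
    # 64 MACs are busy; for a naive GEMV mapping only one activation element is
    # available per cycle, so a single 8-MAC column does useful work.
    TOTAL_MACS = 8 * 8

    gemm_macs_per_cycle = 64            # full outer product
    naive_gemv_macs_per_cycle = 8       # one column active

    print(f"GEMM utilization:         {gemm_macs_per_cycle / TOTAL_MACS:.1%}")
    print(f"naive GEMV utilization:   {naive_gemv_macs_per_cycle / TOTAL_MACS:.1%}")

    # Folding eight independent input ports onto the array (the intuition behind
    # the port-folding scheme described next) lets up to eight such columns run
    # concurrently, pushing utilization back toward 100%.
    folded_macs_per_cycle = 8 * naive_gemv_macs_per_cycle
    print(f"ideal folded utilization: {folded_macs_per_cycle / TOTAL_MACS:.1%}")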

To address this issue, their paper introduces a port-folding GEMV scheme that incorporates multi-format and low-precision techniques while leveraging an outer-product-based processor designed for conventional GEMM operations. This innovative approach achieved an impressive 93.7% utilization on GEMV tasks with an eight-bit format on an 8 × 8 processor, resulting in a 7.5× throughput increase compared with the original design. Additionally, when applied to the matrix operations of the GPT-2 large model, the proposed scheme demonstrated a remarkable 7× speedup on single-batch inferences.

The eighth paper [8], titled “SNN eXpress: Streamlining low-power AI-SoC development with unsigned weight accumulation spiking neural network” by Jang et al., presents solutions for overcoming the challenges of developing low-power AI-SoCs using analog-circuit-based unsigned weight accumulating spiking neural networks (UWA-SNNs). It introduces the SNN eXpress tool, which automates the design process, enabling the rapid development and verification of UWA-SNN-based AI-SoCs, as demonstrated by the creation of two AI-SoCs.

In the ninth paper [9], titled “XEM: Tensor accelerator for AB21 supercomputing artificial intelligence processor,” Jeon et al. introduce the XEM accelerator, which is designed to enhance the AB21 supercomputing AI processor's efficiency when performing the tensor-based linear-algebraic operations that are crucial for hyperscale AI and high-performance computing applications. The XEM architecture, with its outer-product-based parallel floating-point units, is detailed along with new instructions and hardware characteristics, with future verification planned alongside the AB21 processor chip.
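As background on the dataflow, a matrix product can be accumulated as a sum of rank-1 (outer-product) updates, as the short NumPy sketch below shows; how XEM schedules these updates in hardware is beyond what this summary states.

    import numpy as np

    # Matrix multiply C = A @ B expressed as a sum of outer products (rank-1
    # updates), the dataflow style attributed to XEM's floating-point units.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((16, 8))
    B = rng.standard_normal((8, 16))

    C = np.zeros((16, 16))
    for k in range(A.shape[1]):
        C += np.outer(A[:, k], B[k, :])   # one rank-1 update per shared-dimension step

    print("outer-product accumulation matches A @ B:", np.allclose(C, A @ B))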

The tenth paper [10], titled “NEST-C: A deep learning compiler framework for heterogeneous computing systems with AI accelerators” by Park et al., introduces NEST-C, a deep learning compiler framework designed to optimize the deployment and performance of deep learning models across various AI accelerators. This framework achieves significant computational efficiency by incorporating profiling-based quantization, dynamic graph partitioning, and multilevel intermediate representation integration. The experimental results demonstrate that NEST-C enhances throughput and reduces latency across different hardware platforms, making it a versatile tool for modern AI applications.
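Profiling-based quantization in general derives integer scales from activation statistics gathered on sample inputs; the min–max calibration below is a generic sketch of that idea, not NEST-C's actual algorithm or API.

    import numpy as np

    # Generic profiling-based 8-bit quantization: run sample inputs, record the
    # observed activation range, then derive a scale and zero point for uint8.
    # Conceptual sketch only; NEST-C's calibration may differ.
    def profile_range(activation_batches):
        lo = min(float(a.min()) for a in activation_batches)
        hi = max(float(a.max()) for a in activation_batches)
        return lo, hi

    def quantize_uint8(x, lo, hi):
        scale = (hi - lo) / 255.0
        zero_point = round(-lo / scale)
        q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
        return q, scale, zero_point

    rng = np.random.default_rng(3)
    profiled = [rng.normal(0.0, 1.0, size=1024) for _ in range(10)]   # stand-in activations
    lo, hi = profile_range(profiled)
    q, scale, zp = quantize_uint8(profiled[0], lo, hi)
    print(f"range=({lo:.2f}, {hi:.2f}), scale={scale:.4f}, zero_point={zp}")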

The eleventh paper [11], titled “Mixed-mode SNN crossbar array with embedded dummy switch and mid-node pre-charge scheme” by Park et al., introduces a mixed-mode SNN crossbar array that minimizes membrane-computation errors. The authors implemented an embedded dummy switch scheme along with a mid-node pre-charge technique to create a high-precision current-mode synapse. This approach effectively mitigates charge sharing between membrane capacitors and the parasitic capacitance of synapses, thereby reducing computational errors. A prototype chip featuring a 400 × 20 SNN crossbar was fabricated using a 28 nm fully depleted silicon on insulator (FDSOI) complementary metal–oxide–semiconductor (CMOS) process, successfully recognizing 20 MNIST patterns reduced to 20 × 20 pixels with a power consumption of 411 μW. Notably, the peak-to-peak deviation of the normalized output spike count across the 21 fabricated SNN prototype chips remained within 16.5% of the ideal value, accounting for random sample-wise variations.
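The error that the dummy-switch and pre-charge techniques target can be estimated with the standard capacitive charge-sharing relation; the capacitance and voltage values below are purely hypothetical.

    # Standard charge-sharing estimate: when a synapse's parasitic capacitance
    # C_par (initially discharged) is connected to a membrane capacitor C_mem
    # holding V_mem, the membrane voltage is disturbed by roughly
    # V_mem * C_par / (C_mem + C_par). Values are hypothetical; the paper's
    # circuit differs in detail, which is what its compensation scheme corrects.
    C_mem = 100e-15     # membrane capacitance, 100 fF (assumed)
    C_par = 5e-15       # synapse parasitic capacitance, 5 fF (assumed)
    V_mem = 0.5         # stored membrane voltage in volts (assumed)

    error = V_mem * C_par / (C_mem + C_par)
    print(f"charge-sharing error ~ {error*1e3:.1f} mV ({error / V_mem:.1%} of V_mem)")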

The twelfth paper [12], titled “Asynchronous interface circuit for nonlinear connectivity in multi-core spiking neural networks” by Oh et al., addresses the need for an interface circuit that supports multiple SNN cores to facilitate SNN expansion. The proposed circuit employs an asynchronous design approach to mimic the operational characteristics of the human brain; however, the lack of a global clock introduces timing challenges during implementation. To address this issue, the authors propose an intermediate latching template that establishes asynchronous nonlinear connectivity and enables multi-pipeline processing utilizing multiple SNN cores.

This design incorporates arbitration and distribution blocks based on the proposed template and is fabricated as a fully custom interface circuit supporting four SNN cores in a 28 nm CMOS FDSOI process. The results indicate that this innovative template can enhance the throughput of the interface circuit by up to 53% compared with conventional asynchronous designs. Additionally, the interface circuit can transmit a spike with an energy consumption of 1.7 pJ at a supply voltage of 0.9 V, supporting 606 Mevent/s for intra-chip communication, and 3.7 pJ at the same voltage for 59 Mevent/s inter-chip communication.
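For reference, the per-spike energies translate directly into average communication power at the quoted event rates.

    # Average communication power = energy per spike x event rate.
    intra_chip_power = 1.7e-12 * 606e6    # 1.7 pJ/spike at 606 Mevent/s
    inter_chip_power = 3.7e-12 * 59e6     # 3.7 pJ/spike at 59 Mevent/s
    print(f"intra-chip: {intra_chip_power*1e3:.2f} mW")   # ~1.03 mW
    print(f"inter-chip: {inter_chip_power*1e6:.0f} uW")   # ~218 uW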

The thirteenth paper [13], titled “AONet: Attention network with optional activation for unsupervised video anomaly detection” by Rakhmonov et al., proposes AONet, a novel attention-based neural network designed for unsupervised video anomaly detection, which incorporates a unique activation function (OptAF) combining the benefits of the rectified linear unit (ReLU), leaky ReLU, and sigmoid functions. This method efficiently captures spatiotemporal features using a temporal shift module and residual autoencoder, achieving superior performance on benchmark datasets compared with state-of-the-art methods. The model was evaluated on three datasets, demonstrating competitive accuracy and speed with an area under the curve (AUC) score of 97.0% on the UCSD Ped2 dataset.
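The precise definition of OptAF is given in the paper; the sketch below merely illustrates one plausible way to expose ReLU, leaky ReLU, and sigmoid behavior behind a single switchable activation and should not be read as the authors' formula.

    import numpy as np

    # Illustrative "optional" activation that can switch between ReLU,
    # leaky-ReLU, and sigmoid behavior. Hypothetical stand-in for OptAF.
    def optional_activation(x, mode="relu", leak=0.01):
        if mode == "relu":
            return np.maximum(0.0, x)
        if mode == "leaky_relu":
            return np.where(x > 0, x, leak * x)
        if mode == "sigmoid":
            return 1.0 / (1.0 + np.exp(-x))
        raise ValueError(f"unknown mode: {mode}")

    x = np.linspace(-3, 3, 7)
    for mode in ("relu", "leaky_relu", "sigmoid"):
        print(mode, np.round(optional_activation(x, mode), 3))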

As guest editors, we are pleased to explore the future of these cutting-edge technologies in this special issue. We express our deepest gratitude to the authors for their outstanding contributions, reviewers for their thorough review, and editorial team for their assistance in publishing this special issue. We hope that this special issue will contribute to a broader understanding of the developments in AI and quantum technology and foster further research and innovation.

The authors declare that there are no conflicts of interest.
