GEMMV: An LLM-Based Automated Performance-Aware Framework for GEMM Verilog Generation

Impact Factor: 3.8 · CAS Zone 2 (Engineering & Technology) · JCR Q2, Engineering, Electrical & Electronic
Gaoche Zhang;Dingyang Zou;Kairui Sun;Zhihuan Chen;Meiqi Wang;Zhongfeng Wang
{"title":"GEMMV: An LLM-Based Automated Performance-Aware Framework for GEMM Verilog Generation","authors":"Gaoche Zhang;Dingyang Zou;Kairui Sun;Zhihuan Chen;Meiqi Wang;Zhongfeng Wang","doi":"10.1109/JETCAS.2025.3568712","DOIUrl":null,"url":null,"abstract":"Recent advancements in artificial intelligence (AI) models have intensified the need for specialized AI accelerators. The design of optimized general matrix multiplication (GEMM) module tailored for these accelerators is crucial but time-consuming and expertise-demanding, creating a demand for automating design processes. Large language models (LLMs), capable of generating high-quality designs from human instructions, show great promise in automating GEMM module creation. However, the GEMM module’s vast design space and stringent performance requirements, along with the limitations of datasets and the lack of hardware performance awareness of LLMs, have made previous LLM-based register transfer level (RTL) code generation efforts unsuitable for GEMM design. To tackle these challenges, this paper proposes an automated performance-aware LLM-based framework, GEMMV, for generating high-correctness and high-performance Verilog code for GEMM. This framework utilizes in-context learning based on GPT-4 to automatically generate high-quality and well-annotated Verilog code for different variants of the GEMM. Additionally, it leverages in-context learning to obtain performance awareness by integrating a multi-level performance model (MLPM) with fine-tuned LLMs. The Verilog code generated by this framework reduces latency by 3.1x and improves syntax correctness by 65% and functionality correctness by 70% compared to earlier efforts.","PeriodicalId":48827,"journal":{"name":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","volume":"15 2","pages":"325-336"},"PeriodicalIF":3.8000,"publicationDate":"2025-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10994474/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Recent advancements in artificial intelligence (AI) models have intensified the need for specialized AI accelerators. The design of optimized general matrix multiplication (GEMM) modules tailored to these accelerators is crucial but time-consuming and expertise-demanding, creating a demand for automated design processes. Large language models (LLMs), capable of generating high-quality designs from human instructions, show great promise in automating GEMM module creation. However, the GEMM module's vast design space and stringent performance requirements, along with the limitations of existing datasets and LLMs' lack of hardware performance awareness, have made previous LLM-based register transfer level (RTL) code generation efforts unsuitable for GEMM design. To tackle these challenges, this paper proposes an automated, performance-aware, LLM-based framework, GEMMV, for generating high-correctness, high-performance Verilog code for GEMM. The framework uses in-context learning based on GPT-4 to automatically generate high-quality, well-annotated Verilog code for different GEMM variants. Additionally, it leverages in-context learning to obtain performance awareness by integrating a multi-level performance model (MLPM) with fine-tuned LLMs. Compared with earlier efforts, the Verilog code generated by this framework reduces latency by 3.1x and improves syntax correctness by 65% and functional correctness by 70%.
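To make the abstract concrete, below is a minimal hand-written Verilog sketch of the kind of building block a GEMM design is composed of: a single multiply-accumulate (MAC) processing element that computes one running partial sum of C = A x B. This is an illustrative sketch only; the module name, parameters, and interface are assumptions for exposition, not actual GEMMV output, and a real GEMM design would replicate and schedule many such elements.

// Minimal hand-written sketch of a GEMM building block: one
// multiply-accumulate (MAC) processing element. Names and widths
// are illustrative assumptions, not GEMMV framework output.
module mac_pe #(
    parameter DATA_W = 8,   // operand width (assumed)
    parameter ACC_W  = 32   // accumulator width (assumed)
) (
    input  wire               clk,
    input  wire               rst_n,
    input  wire               en,    // accumulate when asserted
    input  wire [DATA_W-1:0]  a,     // element streamed from matrix A
    input  wire [DATA_W-1:0]  b,     // element streamed from matrix B
    output reg  [ACC_W-1:0]   acc    // running dot-product partial sum
);
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            acc <= {ACC_W{1'b0}};      // clear the accumulator
        else if (en)
            acc <= acc + a * b;        // one C[i][j] partial sum per cycle
    end
endmodule

Design choices such as DATA_W, ACC_W, the number of such elements, and their interconnection (e.g., systolic array vs. adder tree) span the vast design space the abstract refers to, which is why performance-aware automated generation is valuable.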
Source Journal
CiteScore: 8.50
Self-citation rate: 2.20%
Articles per year: 86
Journal description: The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.