Empowering LLMs for Verilog Generation through Multi-Level Summarization

Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen
{"title":"Empowering LLMs for Verilog Generation through Multi-Level Summarization","authors":"Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen","doi":"arxiv-2407.10424","DOIUrl":null,"url":null,"abstract":"The increasing complexity and high costs associated with modern processor\ndesign have led to a surge in demand for processor design automation.\nInstruction-tuned large language models (LLMs) have demonstrated remarkable\nperformance in automatically generating code for general-purpose programming\nlanguages like Python. However, these methods fail on hardware description\nlanguages (HDLs) like Verilog due to the scarcity of high-quality instruction\ntuning data, as even advanced LLMs like GPT-3.5 exhibit limited performance on\nVerilog generation. Regarding this issue, we observe that (1) Verilog code\ncollected from the real world has higher quality than those generated by LLMs.\n(2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating\nit. Based on these observations, this paper introduces CodeV, a series of\nopen-source instruction-tuned Verilog generation LLMs. Instead of generating\ndescriptions first and then getting the corresponding code from advanced LLMs,\nwe prompt the LLM with Verilog code and let the LLM generate the corresponding\nnatural language description by multi-level summarization. Experimental results\nshow that CodeV relatively surpasses the previous open-source SOTA by 14.4%\n(BetterV in VerilogEval) and 11.3% (RTLCoder in RTLLM) respectively, and also\nrelatively outperforms previous commercial SOTA GPT-4 by 22.1% in VerilogEval.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"39 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.10424","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction tuning data; even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world is of higher quality than that generated by LLMs, and (2) LLMs like GPT-3.5 excel at summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source, instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural language description through multi-level summarization. Experimental results show that CodeV relatively surpasses the previous open-source SOTA by 14.4% (BetterV on VerilogEval) and 11.3% (RTLCoder on RTLLM), respectively, and also relatively outperforms the previous commercial SOTA, GPT-4, by 22.1% on VerilogEval.
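
The abstract only sketches the data-construction idea, so the following is a minimal, hypothetical illustration of how a multi-level summarization pipeline of this kind might be wired up. The two-stage prompt split, the prompt wording, the `chat` helper, and the use of the OpenAI Python SDK with a GPT-3.5-class model are illustrative assumptions, not the paper's actual implementation.

```python
"""Hypothetical sketch of CodeV-style multi-level summarization data construction.

Assumption (not from the paper): real-world Verilog is summarized by a
GPT-3.5-class model in two stages -- a detailed module-level summary, then a
compressed high-level design requirement -- to form instruction-tuning pairs.
"""
from openai import OpenAI

client = OpenAI()            # reads OPENAI_API_KEY from the environment
MODEL = "gpt-3.5-turbo"      # stand-in for whichever summarization LLM is used


def chat(prompt: str) -> str:
    """Single-turn query to the summarization LLM."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def build_instruction_pair(verilog_code: str) -> dict:
    """Turn one collected Verilog module into a (description, code) training pair."""
    # Level 1: a detailed summary of the module's ports, parameters, and logic.
    detailed = chat(
        "Summarize the functionality of this Verilog module in detail, "
        "covering its ports, parameters, and internal logic:\n\n" + verilog_code
    )
    # Level 2: compress the detailed summary into a short, high-level
    # specification phrased as a design request (the instruction).
    instruction = chat(
        "Rewrite the following summary as a concise design requirement that a "
        "hardware engineer might give to an assistant:\n\n" + detailed
    )
    # The pair is used for instruction tuning: description -> Verilog code.
    return {"instruction": instruction, "output": verilog_code}


if __name__ == "__main__":
    example = "module and_gate(input a, input b, output y); assign y = a & b; endmodule"
    print(build_instruction_pair(example))
```

The point of reversing the usual direction (code-to-description rather than description-to-code) is that the high-quality side of each training pair is the human-written Verilog collected from the real world, while the LLM only performs summarization, the task the abstract notes it is comparatively good at.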