Code Generation Method based on Structured Tree Input and AST Decoder Attention Augmentation

Wenjun Wei, Junhua Wu
Published in: 2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)
Publication date: 2022-12-01
DOI: 10.1109/QRS-C57518.2022.00077 (https://doi.org/10.1109/QRS-C57518.2022.00077)
Citations: 0

Abstract

Automatic code generation from natural language input is an important research topic in software engineering. Earlier approaches mostly used a seq2seq structure built on RNN models. They treat input and output as simple sequences and often ignore the syntactic structure of the source. This paper proposes a code generation method, Tx(Tree-Tree). It replaces simple word sequences with structured trees so that the model can better learn the syntactic and semantic information in the source, which alleviates the long-dependency problem caused by overly long source input. In addition, the decoder adopts an enhanced attention mechanism to distinguish the influence of different historical actions on the currently predicted action. The model is validated on three datasets: DJANGO, CONALA, and ATIS. Compared with some typical models, Tx(Tree-Tree) improves both accuracy and BLEU.
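
The abstract does not say how the enhanced decoder attention is computed. As a rough illustration of the general idea only (weighting earlier decoding actions by their relevance to the current step), the sketch below uses standard scaled dot-product attention. This is an assumed stand-in, not the paper's formulation; the function name `attend_over_history` and the toy 2-dimensional action embeddings are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_history(query, history):
    """Scaled dot-product attention of the current decoder state (query)
    over the embeddings of previously predicted actions (history).

    Returns (weights, context): one weight per past action, and the
    weighted sum of the past action embeddings.
    Assumption: this is generic attention, not the paper's exact mechanism.
    """
    if not history:
        raise ValueError("history must contain at least one action embedding")
    dim = len(query)
    # Similarity between the current state and each past action.
    scores = [
        sum(q * h for q, h in zip(query, past)) / math.sqrt(dim)
        for past in history
    ]
    weights = softmax(scores)
    # Context vector: past actions mixed according to their weights.
    context = [
        sum(w * past[i] for w, past in zip(weights, history))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical embeddings of three earlier actions and the current state.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, context = attend_over_history(query, history)
print(weights, context)
```

In this toy input, the first and third past actions align with the query and receive equal, larger weights than the second, which is the kind of distinction between historical actions that the abstract describes.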