Improving Code Completion by Sequence Features and Structural Features

Ya-Ping Liu, Zhiqiu Huang, Yaoshen Yu, Yasir Hussain, Lile Lin
Proceedings of the 4th World Symposium on Software Engineering, published 2022-09-28
DOI: 10.1145/3568364.3568373 (https://doi.org/10.1145/3568364.3568373)
Citations: 1

Abstract

Code completion is essential in integrated development environments (IDEs), and it has shown intelligence in helping developers be productive. Recently, neural network-based models have improved code completion by capturing code information from the abstract syntax tree (AST). However, these methods suffer from two issues. First, code sequence features are not fully exploited. Second, sequence features are not effectively combined with structural features. In this paper, we first explore the effectiveness of code sequence features using relative position encoding. We then combine the sequence features with structural features through an extended attention mechanism to enhance performance. We evaluate the proposed approach on two real-world datasets and find that sequence features are crucial for code completion in practice, and that combining them with structural features further improves completion performance. We also employ Byte-Pair Encoding (BPE) to mitigate the out-of-vocabulary (OOV) issue in this task. Our best model improves the mean reciprocal rank (MRR) metric by 10% compared to previous research.
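The abstract reports its headline result in terms of mean reciprocal rank (MRR). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: for each completion query the score is 1/rank of the correct token in the model's ranked candidate list, or 0 if the token is absent, and MRR is the average over all queries. The function name, the candidate lists, and the token strings are hypothetical illustrations, not taken from the paper. The abstract does not say whether the candidate list is truncated to a top-k, nor whether the 10% figure is relative or absolute, so neither is assumed here.

```python
from typing import Sequence


def mean_reciprocal_rank(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
) -> float:
    """Average of 1/rank of the correct token; a miss contributes 0."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("need exactly one candidate list per target")
    if not targets:
        return 0.0

    total = 0.0
    for candidates, target in zip(ranked_predictions, targets):
        # Ranks are 1-based: the top suggestion has rank 1.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(targets)


# Hypothetical completion results for three queries.
predictions = [
    ["foo", "bar", "baz"],  # correct token at rank 1 -> 1.0
    ["len", "print"],       # correct token at rank 2 -> 0.5
    ["x", "y"],             # correct token missing   -> 0.0
]
targets = ["foo", "print", "z"]

score = mean_reciprocal_rank(predictions, targets)
print(score)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Because a hit at rank 1 counts twice as much as a hit at rank 2, MRR rewards models that place the correct token at the very top of the suggestion list, which matches how completion suggestions are consumed in an IDE.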