Ya-Ping Liu, Zhiqiu Huang, Yaoshen Yu, Yasir Hussain, Lile Lin
Proceedings of the 4th World Symposium on Software Engineering, published 2022-09-28. DOI: 10.1145/3568364.3568373 (https://doi.org/10.1145/3568364.3568373)
Improving Code Completion by Sequence Features and Structural Features
Code completion is essential in integrated development environments (IDEs) and has proven effective at helping developers be more productive. Recently, neural network-based models have improved code completion by capturing code information from the abstract syntax tree (AST). However, these methods suffer from two issues. First, code sequence features are not fully exploited. Second, sequence features are not effectively combined with structural features. In this paper, we first explore the effectiveness of code sequence features using relative position encoding. We then combine the sequence features with structural features using an extended attention mechanism to enhance performance. We evaluate the proposed approach on two real-world datasets and find that sequence features are crucial for code completion in practice, and that combining them with structural features further improves completion performance. We also employ Byte-Pair Encoding (BPE) to mitigate the out-of-vocabulary (OOV) issue in this task. Our best model achieves a 10% improvement in mean reciprocal rank (MRR) over previous work.
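For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean reciprocal rank is commonly computed for next-token completion. It is not the authors' evaluation code: the function name `mean_reciprocal_rank`, the list-of-candidates input format, and the convention that a missing target scores 0 are all assumptions made for illustration.

```python
from typing import Sequence


def mean_reciprocal_rank(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
) -> float:
    """Average of 1/rank of the correct token over all completion queries.

    Assumption: each entry of ranked_predictions is a candidate list sorted
    from most to least likely, and a target absent from the list scores 0.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("need exactly one target per prediction list")
    if not targets:
        return 0.0

    total = 0.0
    for candidates, target in zip(ranked_predictions, targets):
        # Ranks are 1-based: the top suggestion contributes 1/1.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(targets)


# Hypothetical example: three completion queries with top-3 suggestions each.
predictions = [["foo", "bar", "baz"], ["x", "y", "z"], ["a", "b", "c"]]
targets = ["foo", "y", "missing"]
print(mean_reciprocal_rank(predictions, targets))  # (1 + 1/2 + 0) / 3 = 0.5
```

Under this definition, a higher MRR means the correct token tends to appear nearer the top of the suggestion list, which is why the metric suits ranked completion popups in an IDE.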