Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models

2021 IEEE International Conference on Software Maintenance and Evolution (ICSME) Pub Date: 2021-06-26 DOI: 10.26226/morressier.613b5419842293c031b5b638

Jingxuan Li, Rui Huang, Wei Li, Kai-Lang Yao, Weiguo Tan


Citations: 6

Abstract

Code completion is widely used by software developers to provide coding suggestions given a partially written code snippet. Traditional code completion methods support only single-token completion at a limited set of positions, whereas recent studies show the ability to provide longer completions at more flexible positions. However, such frequently triggered and longer completions reduce overall precision because they generate more invalid results. Moreover, the approaches from different studies are mostly incompatible with each other. It is therefore vital to develop an ensemble framework that combines results from multiple models, drawing on the merits and offsetting the defects of each. This paper conducts a coding simulation to collect data from the code context and from different code completion models, and then applies the data to two tasks. First, we introduce an acceptance model that dynamically controls whether completion results are displayed to the developer. It uses features from the simulation to predict whether a correct result exists in the output of these models. Our best model reduces the percentage of false-positive completions from 55.09% to 17.44%. Second, we design a fusion ranking scheme that automatically identifies the priority of completion results and reorders the candidates from multiple code completion models. This scheme is flexible in dealing with various models, regardless of the type or length of their completion results. We integrate this ranking scheme with two frequency models and a GPT-2-style language model, along with the acceptance model, to yield increases of 27.80% and 37.64% in TOP1 and TOP5 accuracy, respectively. In addition, we propose a new code completion evaluation metric, Benefit-Cost Ratio (BCR), which takes into account the benefit of saved keystrokes and the hidden cost of browsing the completion list, and is therefore closer to a developer's real coding experience.
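To make the three ideas in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. The abstract does not specify the acceptance model's features, the fusion ranking rule, or the BCR formula, so everything here is an assumption for illustration: the `Candidate` type, the "keep the best score per distinct completion" merge, a fixed score threshold standing in for the learned acceptance model, and BCR computed as total keystrokes saved divided by total candidates browsed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str     # suggested completion
    source: str   # hypothetical label for the model that produced it
    score: float  # assumed comparable across models, in [0, 1]


def fuse_and_rank(candidate_lists, top_k=5):
    """Merge candidates from several models and sort by score.

    Assumed rule: keep the highest-scoring entry per distinct text.
    """
    best = {}
    for candidates in candidate_lists:
        for c in candidates:
            if c.text not in best or c.score > best[c.text].score:
                best[c.text] = c
    ranked = sorted(best.values(), key=lambda c: c.score, reverse=True)
    return ranked[:top_k]


def should_display(ranked, threshold=0.5):
    """Stand-in acceptance gate: a fixed threshold on the top score.

    The paper trains a model for this decision; the threshold is a placeholder.
    """
    return bool(ranked) and ranked[0].score >= threshold


def benefit_cost_ratio(events):
    """Assumed BCR: keystrokes saved divided by candidates browsed.

    `events` is an iterable of (keystrokes_saved, candidates_browsed) pairs.
    """
    events = list(events)
    benefit = sum(saved for saved, _ in events)
    cost = sum(browsed for _, browsed in events)
    return benefit / cost if cost else 0.0


# Hypothetical outputs of a frequency model and a language model.
freq = [Candidate("append", "frequency", 0.40),
        Candidate("add", "frequency", 0.30)]
lm = [Candidate("append(item)", "language_model", 0.70),
      Candidate("append", "language_model", 0.55)]

ranked = fuse_and_rank([freq, lm])
shown = should_display(ranked)
bcr = benefit_cost_ratio([(10, 1), (0, 3), (6, 2)])
print([c.text for c in ranked], shown, round(bcr, 2))
```

In this sketch, suppressing a list that is unlikely to contain a correct result avoids browsing cost with no keystroke benefit, which is the kind of hidden cost the abstract's BCR metric is meant to capture.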

