Test-Driven Multi-Task Learning with Functionally Equivalent Code Transformation for Neural Code Generation

Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. Pub Date: 2022-10-10. DOI: 10.1145/3551349.3559549

Xin Wang, Xiao Liu, Pingyi Zhou, Qixia Liu, Jin Liu, Hao Wu, Xiao Cui

Citations: 1

Abstract

Automated code generation is a longstanding challenge in both the software engineering and artificial intelligence communities. Recent work has begun to investigate the functional correctness of generated code, where a code snippet is considered correct if it passes a set of test cases. However, most existing approaches still model code generation as text generation and ignore program-specific information such as functionally equivalent code snippets and test execution feedback. To address these limitations, this paper proposes a method that combines program analysis with deep learning for neural code generation, in which functionally equivalent code snippets and test execution feedback are used at the training stage. Concretely, we first design several code transformation heuristics that produce different variants of a code snippet with the same functionality. In addition, we use test execution feedback to design a test-driven discriminative task that trains a novel discriminator, so that the model learns to distinguish whether generated code is correct. Preliminary results on a newly published dataset demonstrate the effectiveness of the proposed framework. In particular, on the pass@1 metric, we achieve gains of 8.81 and 11.53 over CodeGPT and CodeT5, respectively.
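The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not list the transformation heuristics, so consistent variable renaming is assumed here as one plausible functionally equivalent transformation, and the helper names (`rename_variant`, `passes_tests`, `discriminator_labels`) are hypothetical. Test execution feedback is reduced to a binary label of the kind a discriminator could be trained on.

```python
import ast


class RenameLocals(ast.NodeTransformer):
    """Rename arguments and variables consistently (assumed heuristic)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_arg(self, node):
        # Function parameters, e.g. `a` in `def add(a, b)`.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        # Every read or write of a variable in the body.
        node.id = self.mapping.get(node.id, node.id)
        return node


def rename_variant(source, mapping):
    """Return a variant of `source` with identifiers renamed per `mapping`.

    Behaviour is preserved only for positional calls; callers that pass
    keyword arguments would see the renamed parameters.
    """
    tree = RenameLocals(mapping).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def passes_tests(source, func_name, test_cases):
    """Run `source` and check every (args, expected) pair.

    Any exception counts as a failure. `exec` is used for brevity; real
    generated code should be run in a sandbox.
    """
    namespace = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        return False


def discriminator_labels(candidates, func_name, test_cases):
    """Pair each candidate with 1 (passes all tests) or 0 (fails)."""
    return [
        (code, int(passes_tests(code, func_name, test_cases)))
        for code in candidates
    ]


reference = "def add(a, b):\n    total = a + b\n    return total\n"
variant = rename_variant(reference, {"a": "x", "b": "y", "total": "result"})
buggy = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]
labels = discriminator_labels([reference, variant, buggy], "add", tests)
```

Here `variant` differs textually from `reference` but passes the same tests, so it could serve as an additional training target, while `buggy` receives label 0 and acts as a negative example for the discriminative task.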

