Automated Code Editing With Search-Generate-Modify
Authors: Changshu Liu; Pelin Cetin; Yogesh Patodia; Baishakhi Ray; Saikat Chakraborty; Yangruibo Ding
Journal: IEEE Transactions on Software Engineering (JCR Q1, Computer Science, Software Engineering; impact factor 6.5)
Published: 2024-03-27
DOI: 10.1109/TSE.2024.3376387
URL: https://ieeexplore.ieee.org/document/10482873/
Code editing is essential in evolving software development. The literature proposes several automated code editing tools, which leverage Information Retrieval-based techniques and Machine Learning-based code generation and code editing models. Each technique comes with its own promises and perils, so they are often used together to complement each other's strengths and compensate for each other's weaknesses. This paper proposes a hybrid approach to better synthesize code edits by leveraging the power of code search, generation, and modification. Our key observation is that a patch obtained by search and retrieval, even if incorrect, can provide helpful guidance to a code generation model. However, a retrieval-guided patch produced by a code generation model can still be a few tokens off from the intended patch. Such generated patches can be slightly modified to create the intended patches. We developed a novel tool to solve this challenge: SarGaM, which is designed to follow a real developer's code editing behavior. Given an original code version, the developer may search for related patches, generate or write the code, and then modify the generated code to adapt it to the right context. Our evaluation of SarGaM on edit generation shows superior performance with respect to the current state-of-the-art techniques. SarGaM also shows its effectiveness on automated program repair tasks.
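The search, generate, and modify steps described in the abstract can be pictured with a small sketch. This is not SarGaM's implementation: the names `EditExample`, `search`, `generate`, `modify`, and `edit` are hypothetical, retrieval is approximated with plain Jaccard token overlap, and the generation and modification models are replaced by toy stand-in functions. It only illustrates how a retrieved patch could guide generation and how a candidate that is a few tokens off could then be touched up.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class EditExample:
    """A past edit: the code before the change and the patch that replaced it."""
    before: str
    after: str


def tokens(code: str) -> List[str]:
    # Naive whitespace tokenizer; a real system would use a language-aware lexer.
    return code.split()


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    # Set-overlap similarity, used here as a stand-in for a real retrieval model.
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def search(original: str, corpus: Sequence[EditExample]) -> EditExample:
    # Step 1 (search): retrieve the past edit whose "before" code is most similar.
    query = tokens(original)
    return max(corpus, key=lambda ex: jaccard(query, tokens(ex.before)))


def generate(original: str, retrieved: EditExample,
             model: Callable[[str, str], str]) -> str:
    # Step 2 (generate): give the generator both the original code and the
    # retrieved patch as guidance. `model` is an assumed callable.
    return model(original, retrieved.after)


def modify(candidate: str, fixer: Callable[[str], str]) -> str:
    # Step 3 (modify): apply a small token-level correction to the candidate.
    return fixer(candidate)


def edit(original: str, corpus: Sequence[EditExample],
         model: Callable[[str, str], str], fixer: Callable[[str], str]) -> str:
    retrieved = search(original, corpus)
    candidate = generate(original, retrieved, model)
    return modify(candidate, fixer)


# Toy stand-ins (assumptions for illustration only).
def toy_model(original: str, guidance: str) -> str:
    # Pretend generator: copies the retrieved patch, so it is a few tokens off.
    return guidance


def toy_fixer(candidate: str) -> str:
    # Pretend modifier: adapts the identifier to the current context.
    return " ".join("name" if t == "x" else t for t in candidate.split())


corpus = [
    EditExample("if ( x == null ) return ;",
                "if ( x == null ) throw new IllegalArgumentException ( ) ;"),
    EditExample("for ( int i = 0 ; i <= n ; i ++ )",
                "for ( int i = 0 ; i < n ; i ++ )"),
]

original = "if ( name == null ) return ;"
result = edit(original, corpus, toy_model, toy_fixer)
print(result)
```

In this toy run the retrieved patch is not correct for the new context (it refers to `x`), yet it carries the shape of the intended edit, and the final step only has to change one token. In a learned pipeline each of the three callables would be replaced by a trained component.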
About the journal:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.