Structuring Meaningful Code Changes in Developer Community
Mengxuan Li, Shikai Guo, X. Ge, Hui Li, Rong Chen
2022 IEEE 13th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), published 2022-11-25
DOI: 10.1109/PAAP56126.2022.10010364 (https://doi.org/10.1109/PAAP56126.2022.10010364)
Citation count: 0
Abstract
The rapid development of Open-Source Software (OSS) has created significant demand for code changes to maintain OSS. Symptoms of poor design and implementation choices often appear in code changes, which heavily hinders code reviewers from verifying their correctness and soundness. Researchers have investigated how to learn meaningful code changes to help developers anticipate the changes that code reviewers may suggest for submitted code. However, two main limitations remain: the long-range dependencies of source code and the missing syntactic structural information of source code. To address these limitations, we propose a novel method named GTCT. GTCT comprises two components: code graph embedding and code transformation learning. To address the missing syntactic structural information, we encode the source code into a code graph structure built from its lexical and syntactic representations. We then use the multi-head attention mechanism and the positional encoding mechanism to address the long-range dependency limitation. Extensive experiments evaluate the performance of GTCT through both quantitative and qualitative analyses. In the quantitative analysis, GTCT outperforms the baseline on six datasets by relative margins of 210%, 342.86%, 135%, 29.41%, 109.09%, and 91.67% in terms of perfect prediction. Meanwhile, the qualitative analysis shows that GTCT outperforms the baseline method for each type of code change in the taxonomy: bug fixes, refactoring, and other changes.
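The percentages above are easiest to read as relative improvements over the baseline. The abstract does not spell out the formula, so the following is a minimal sketch under the assumption that "relatively outperforms" means (candidate - baseline) / baseline. The function name and the counts of perfect predictions are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline: float, candidate: float) -> float:
    """Return the gain of candidate over baseline as a percentage of baseline.

    Assumed definition; the abstract does not state the formula it uses.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical counts of perfect predictions, chosen only for illustration.
    baseline_hits = 10
    candidate_hits = 31
    gain = relative_improvement(baseline_hits, candidate_hits)
    print(f"relative improvement: {gain:.2f}%")
```

Under this reading, a figure above 100% means the method more than doubles the number of perfect predictions, which can still correspond to a small absolute count when the baseline is low.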