FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

Jinhao Dong, Yiling Lou, Qihao Zhu, Zeyu Sun, Zhilin Li, Wenjie Zhang, Dan Hao
Published in: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), May 2022
DOI: 10.1145/3510003.3510069 (https://doi.org/10.1145/3510003.3510069)
Citations: 24

Abstract

Commit messages summarize the code changes of each commit in natural language. They help developers understand code changes without digging into implementation details and play an essential role in comprehending software evolution. To reduce the human effort of writing commit messages, researchers have proposed various automated techniques to generate them, including template-based, information-retrieval-based, and learning-based techniques. Although promising, previous techniques have limited effectiveness because of their coarse-grained code change representations. This work proposes a novel commit message generation technique, FIRA, which first represents code changes via fine-grained graphs and then learns to generate commit messages automatically. Unlike previous techniques, FIRA represents code changes with fine-grained graphs that explicitly describe the code edit operations between the old version and the new version, as well as code tokens at different granularities (i.e., sub-tokens and integral tokens). Based on this graph-based representation, FIRA generates commit messages with a generation model that consists of a graph-neural-network-based encoder and a transformer-based decoder. To make both sub-tokens and integral tokens available as ingredients for commit message generation, the decoder further incorporates a novel dual copy mechanism. We perform an extensive study to evaluate the effectiveness of FIRA. Our quantitative results show that FIRA outperforms state-of-the-art techniques in terms of BLEU, ROUGE-L, and METEOR, and our ablation analysis shows that the major components of our technique each contribute positively to the effectiveness of FIRA. In addition, we perform a human study to evaluate the quality of the generated commit messages from the perspective of developers, and the results consistently show the effectiveness of FIRA over the compared techniques.
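
To make the idea of a fine-grained change graph more concrete, the following is a minimal sketch and not the authors' construction. It assumes a change is given as two flat token lists, uses Python's `difflib.SequenceMatcher` as a stand-in for the diff step, and splits identifiers on camelCase and snake_case boundaries. The names `split_subtokens`, `build_change_graph`, and all node and edge labels are hypothetical and chosen only for illustration.

```python
import re
from difflib import SequenceMatcher


def split_subtokens(token):
    """Split an identifier on camelCase / snake_case boundaries (assumed rule)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", token)
    # Tokens with no letters or digits (e.g. "=") are kept whole.
    return [p.lower() for p in parts] or [token]


def build_change_graph(old_tokens, new_tokens):
    """Return (nodes, edges) for a toy change graph.

    nodes: list of (kind, label); edges: list of (src, dst, edge_type).
    This illustrates the general idea only; the labels are hypothetical.
    """
    nodes, edges = [], []

    def add_node(kind, label):
        nodes.append((kind, label))
        return len(nodes) - 1

    def add_tokens(tokens, version):
        ids = []
        for tok in tokens:
            tok_id = add_node(f"{version}_token", tok)
            subs = split_subtokens(tok)
            if len(subs) > 1:
                # Link the integral token to each of its sub-tokens.
                for sub in subs:
                    edges.append((tok_id, add_node("subtoken", sub), "has_subtoken"))
            ids.append(tok_id)
        return ids

    old_ids = add_tokens(old_tokens, "old")
    new_ids = add_tokens(new_tokens, "new")

    # difflib stands in for the diff step here (an assumption of this sketch).
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for i, j in zip(range(i1, i2), range(j1, j2)):
                edges.append((old_ids[i], new_ids[j], "match"))
            continue
        # One explicit edit-operation node per 'replace', 'delete', or 'insert'.
        op = add_node("edit", tag)
        edges.extend((op, old_ids[i], "edits_old") for i in range(i1, i2))
        edges.extend((op, new_ids[j], "edits_new") for j in range(j1, j2))
    return nodes, edges


if __name__ == "__main__":
    old = ["int", "maxRetries", "=", "3", ";"]
    new = ["int", "maxRetries", "=", "5", ";"]
    nodes, edges = build_change_graph(old, new)
    for src, dst, kind in edges:
        print(nodes[src], f"--{kind}-->", nodes[dst])
```

In this sketch the edit operation is a node of its own rather than a flag on a token, so a graph encoder could pass information between the operation and the tokens it touches. The graph-neural-network encoder, the transformer decoder, and the dual copy mechanism described in the abstract are not covered here.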