A Simple Retrieval-based Method for Code Comment Generation
Xiaoning Zhu, Chaofeng Sha, Junyu Niu
2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2022-03-01
DOI: 10.1109/saner53432.2022.00126
Citations: 2
Abstract
Code comments help developers comprehend programs, but writing good comments for source code is challenging and time-consuming, which makes automatic comment generation a promising research direction. Recently, researchers have leveraged neural machine translation to generate comments from source code and achieved impressive results. Another line of work exploits information retrieval (IR) techniques and has shown strong performance improvements on this task. However, current retrieval-based methods usually involve complex retrieval and editing operations that are difficult to implement. To address these problems, we propose kNN-Transformer, a simple end-to-end retrieval-based code comment generation method. Our method combines a simple nearest-neighbor retrieval module with a powerful transformer-based model. When generating each token, the retrieval module estimates a probability distribution that depends on the current translation context, rather than obtaining the retrieved samples in advance. Experimental results on four widely used public datasets (two Java and two Python) demonstrate that our method outperforms all the baselines, and that our kNN retrieval module brings significant improvement when similar code snippets are available.
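The per-token retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything concrete here is an assumption: the datastore layout (pairs of context vector and next token), the squared Euclidean distance, the softmax temperature, the linear interpolation weight `lam`, and the function names `knn_distribution` and `interpolate` are all hypothetical. The sketch only shows one plausible way a nearest-neighbor module could produce a token distribution from the current decoding context and mix it with a model's own prediction.

```python
import math
from collections import defaultdict


def knn_distribution(query, datastore, k=2, temperature=1.0):
    """Build a token distribution from the k stored contexts nearest to `query`.

    `datastore` is a list of (context_vector, next_token) pairs. This layout
    is an assumption made for illustration, not taken from the paper.
    """
    # Squared Euclidean distance from the query to every stored context.
    scored = [
        (sum((q - c) ** 2 for q, c in zip(query, ctx)), token)
        for ctx, token in datastore
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    if not nearest:
        return {}

    # Softmax over negative distances; subtracting the smallest distance
    # keeps the exponentials numerically stable.
    min_dist = nearest[0][0]
    weights = [math.exp(-(dist - min_dist) / temperature) for dist, _ in nearest]
    total = sum(weights)

    # Neighbours that share a token pool their probability mass.
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, nearest):
        probs[token] += weight / total
    return dict(probs)


def interpolate(model_probs, knn_probs, lam=0.5):
    """Linearly mix the model distribution with the retrieval distribution.

    `lam` is a hypothetical mixing weight: 0 ignores retrieval, 1 ignores the model.
    """
    vocab = set(model_probs) | set(knn_probs)
    return {
        tok: lam * knn_probs.get(tok, 0.0) + (1 - lam) * model_probs.get(tok, 0.0)
        for tok in vocab
    }


# Toy example with made-up 2-d context vectors and comment tokens.
store = [([0.0, 0.0], "returns"), ([0.1, 0.0], "returns"), ([5.0, 5.0], "sets")]
knn = knn_distribution([0.0, 0.1], store, k=2)
mixed = interpolate({"returns": 0.4, "sets": 0.6}, knn, lam=0.5)
print(sorted(mixed.items()))
```

In this toy run both nearest neighbours vote for the same token, so retrieval shifts probability toward it even though the stand-in model distribution preferred the other token. This mirrors the abstract's observation that retrieval helps most when similar code snippets are available.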