{"title":"基于辅助代码分类任务的源代码神经注释生成","authors":"Minghao Chen, Xiaojun Wan","doi":"10.1109/APSEC48747.2019.00076","DOIUrl":null,"url":null,"abstract":"Code comments help program developers understand programs, read and navigate source code, thus resulting in more efficient software maintenance. Unfortunately, many codes are not commented adequately, or the code comments are missing. So developers have to spend additional time in reading source code. In this paper, we propose a new approach to automatically generating comments for source codes. Following the intuition behind the traditional sequence-to-sequence (Seq2Seq) model for machine translation, we propose a tree-to-sequence (Tree2Seq) model for code comment generation, which leverages an encoder to capture the structure information of source code. More importantly, code classification is involved as an auxiliary task for aiding the Tree2Seq model. We build a multi-task learning model to achieve this goal. We evaluate our models on a benchmark dataset with automatic metrics like BLEU, ROUGE, and METEOR. Experimental results show that our proposed Tree2Seq model outperforms traditional Seq2Seq model with attention, and our proposed multi-task learning model outperforms the state-of-the-art approaches by a substantial margin.","PeriodicalId":325642,"journal":{"name":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Neural Comment Generation for Source Code with Auxiliary Code Classification Task\",\"authors\":\"Minghao Chen, Xiaojun Wan\",\"doi\":\"10.1109/APSEC48747.2019.00076\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code comments help program developers understand programs, read and navigate source code, thus resulting in more efficient software maintenance. Unfortunately, many codes are not commented adequately, or the code comments are missing. So developers have to spend additional time in reading source code. In this paper, we propose a new approach to automatically generating comments for source codes. Following the intuition behind the traditional sequence-to-sequence (Seq2Seq) model for machine translation, we propose a tree-to-sequence (Tree2Seq) model for code comment generation, which leverages an encoder to capture the structure information of source code. More importantly, code classification is involved as an auxiliary task for aiding the Tree2Seq model. We build a multi-task learning model to achieve this goal. We evaluate our models on a benchmark dataset with automatic metrics like BLEU, ROUGE, and METEOR. Experimental results show that our proposed Tree2Seq model outperforms traditional Seq2Seq model with attention, and our proposed multi-task learning model outperforms the state-of-the-art approaches by a substantial margin.\",\"PeriodicalId\":325642,\"journal\":{\"name\":\"2019 26th Asia-Pacific Software Engineering Conference (APSEC)\",\"volume\":\"80 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 26th Asia-Pacific Software Engineering Conference (APSEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APSEC48747.2019.00076\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSEC48747.2019.00076","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Neural Comment Generation for Source Code with Auxiliary Code Classification Task
Code comments help program developers understand programs, read and navigate source code, thus resulting in more efficient software maintenance. Unfortunately, many codes are not commented adequately, or the code comments are missing. So developers have to spend additional time in reading source code. In this paper, we propose a new approach to automatically generating comments for source codes. Following the intuition behind the traditional sequence-to-sequence (Seq2Seq) model for machine translation, we propose a tree-to-sequence (Tree2Seq) model for code comment generation, which leverages an encoder to capture the structure information of source code. More importantly, code classification is involved as an auxiliary task for aiding the Tree2Seq model. We build a multi-task learning model to achieve this goal. We evaluate our models on a benchmark dataset with automatic metrics like BLEU, ROUGE, and METEOR. Experimental results show that our proposed Tree2Seq model outperforms traditional Seq2Seq model with attention, and our proposed multi-task learning model outperforms the state-of-the-art approaches by a substantial margin.