Why My Code Summarization Model Does Not Work

Qiuyuan Chen, Xin Xia, Han Hu, D. Lo, Shanping Li
{"title":"为什么我的代码总结模型不起作用","authors":"Qiuyuan Chen, Xin Xia, Han Hu, D. Lo, Shanping Li","doi":"10.1145/3434280","DOIUrl":null,"url":null,"abstract":"Code summarization aims at generating a code comment given a block of source code and it is normally performed by training machine learning algorithms on existing code block-comment pairs. Code comments in practice have different intentions. For example, some code comments might explain how the methods work, while others explain why some methods are written. Previous works have shown that a relationship exists between a code block and the category of a comment associated with it. In this article, we aim to investigate to which extent we can exploit this relationship to improve code summarization performance. We first classify comments into six intention categories and manually label 20,000 code-comment pairs. These categories include “what,” “why,” “how-to-use,” “how-it-is-done,” “property,” and “others.” Based on this dataset, we conduct an experiment to investigate the performance of different state-of-the-art code summarization approaches on the categories. We find that the performance of different code summarization approaches varies substantially across the categories. Moreover, the category for which a code summarization model performs the best is different for the different models. In particular, no models perform the best for “why” and “property” comments among the six categories. We design a composite approach to demonstrate that comment category prediction can boost code summarization to reach better results. The approach leverages classified code-category labeled data to train a classifier to infer categories. Then it selects the most suitable models for inferred categories and outputs the composite results. Our composite approach outperforms other approaches that do not consider comment categories and obtains a relative improvement of 8.57% and 16.34% in terms of ROUGE-L and BLEU-4 score, respectively.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"22 1","pages":"1 - 29"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":"{\"title\":\"Why My Code Summarization Model Does Not Work\",\"authors\":\"Qiuyuan Chen, Xin Xia, Han Hu, D. Lo, Shanping Li\",\"doi\":\"10.1145/3434280\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code summarization aims at generating a code comment given a block of source code and it is normally performed by training machine learning algorithms on existing code block-comment pairs. Code comments in practice have different intentions. For example, some code comments might explain how the methods work, while others explain why some methods are written. Previous works have shown that a relationship exists between a code block and the category of a comment associated with it. In this article, we aim to investigate to which extent we can exploit this relationship to improve code summarization performance. We first classify comments into six intention categories and manually label 20,000 code-comment pairs. These categories include “what,” “why,” “how-to-use,” “how-it-is-done,” “property,” and “others.” Based on this dataset, we conduct an experiment to investigate the performance of different state-of-the-art code summarization approaches on the categories. 
We find that the performance of different code summarization approaches varies substantially across the categories. Moreover, the category for which a code summarization model performs the best is different for the different models. In particular, no models perform the best for “why” and “property” comments among the six categories. We design a composite approach to demonstrate that comment category prediction can boost code summarization to reach better results. The approach leverages classified code-category labeled data to train a classifier to infer categories. Then it selects the most suitable models for inferred categories and outputs the composite results. Our composite approach outperforms other approaches that do not consider comment categories and obtains a relative improvement of 8.57% and 16.34% in terms of ROUGE-L and BLEU-4 score, respectively.\",\"PeriodicalId\":7398,\"journal\":{\"name\":\"ACM Transactions on Software Engineering and Methodology (TOSEM)\",\"volume\":\"22 1\",\"pages\":\"1 - 29\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"41\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Software Engineering and Methodology (TOSEM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3434280\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology (TOSEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3434280","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 41

Abstract

Code summarization aims to generate a code comment for a given block of source code, and it is normally performed by training machine learning algorithms on existing code-comment pairs. In practice, code comments have different intentions. For example, some code comments might explain how the methods work, while others explain why some methods are written. Previous work has shown that a relationship exists between a code block and the category of the comment associated with it. In this article, we aim to investigate to what extent we can exploit this relationship to improve code summarization performance. We first classify comments into six intention categories and manually label 20,000 code-comment pairs. The categories are "what," "why," "how-to-use," "how-it-is-done," "property," and "others." Based on this dataset, we conduct an experiment to investigate the performance of different state-of-the-art code summarization approaches across the categories. We find that the performance of the approaches varies substantially across the categories. Moreover, the category on which a code summarization model performs best differs from model to model. In particular, among the six categories, no model performs best on "why" and "property" comments. We design a composite approach to demonstrate that comment category prediction can boost code summarization results. The approach leverages the category-labeled data to train a classifier that infers comment categories. It then selects the most suitable model for each inferred category and outputs the composite results. Our composite approach outperforms approaches that do not consider comment categories, obtaining relative improvements of 8.57% and 16.34% in ROUGE-L and BLEU-4 scores, respectively.
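The composite approach is only outlined in the abstract; this page does not reproduce the paper's implementation. Below is a minimal sketch of the routing idea in Python, assuming a TF-IDF plus logistic-regression intention classifier and a placeholder per-category summarizer table; both are illustrative stand-ins, not the paper's actual components.

    # Sketch of the composite idea: infer a code block's likely comment
    # intention, then dispatch to the summarizer that performs best on
    # that category. Classifier and summarizers here are stand-ins.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    CATEGORIES = ["what", "why", "how-to-use", "how-it-is-done", "property", "others"]

    def summarize_generic(code: str) -> str:
        # Placeholder for a trained summarization model.
        return "generated comment for: " + code.splitlines()[0]

    # Hypothetical table mapping each category to the model that scored
    # best on it; every entry is the same placeholder in this sketch.
    BEST_MODEL = {category: summarize_generic for category in CATEGORIES}

    def train_intention_classifier(code_blocks, labels):
        """Fit a category classifier on labeled code-comment pairs."""
        classifier = make_pipeline(
            TfidfVectorizer(token_pattern=r"\w+"),  # treat code tokens as words
            LogisticRegression(max_iter=1000),
        )
        classifier.fit(code_blocks, labels)
        return classifier

    def composite_summarize(code: str, classifier) -> str:
        """Route a code block to the summarizer for its inferred category."""
        category = classifier.predict([code])[0]
        return BEST_MODEL[category](code)

The reported gain comes precisely from this routing step: each code block is summarized by the model that is strongest on its inferred intention category, rather than by a single one-size-fits-all model.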
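For the evaluation metrics, here is a hedged illustration of how sentence-level BLEU-4 and ROUGE-L are commonly computed, using the nltk and rouge-score packages; these are common choices, and the paper's exact evaluation scripts are not shown on this page.

    # Score a generated comment against a reference comment.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
    from rouge_score import rouge_scorer

    reference = "returns the index of the first matching element"
    candidate = "return index of first matching element"

    # BLEU-4: geometric mean of 1- to 4-gram precisions; smoothing avoids
    # zero scores on short comments.
    bleu4 = sentence_bleu(
        [reference.split()], candidate.split(),
        weights=(0.25, 0.25, 0.25, 0.25),
        smoothing_function=SmoothingFunction().method1,
    )

    # ROUGE-L: F-measure based on the longest common subsequence.
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = scorer.score(reference, candidate)["rougeL"].fmeasure

    print(f"BLEU-4: {bleu4:.3f}  ROUGE-L: {rouge_l:.3f}")

Note that a relative improvement of 8.57% in ROUGE-L means the composite score is 1.0857 times the baseline score, not 8.57 points higher.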