Empirical analysis of comments and fault-proneness in methods: can comments point to faulty methods?

International Symposium on Empirical Software Engineering and Measurement. Pub Date: 2014-09-18. DOI: 10.1145/2652524.2652592

Hirohisa Aman, Takashi Sasaki, S. Amasaki, Minoru Kawahara

Citations: 5

Abstract

[Context] Comments improve the readability of programs, so they are generally considered harmless to software quality. However, comments are sometimes added to compensate for a lack of readability in complicated programs: some programmers add in-depth comments to code fragments that are hard for other developers to understand. In the field of code refactoring, well-written comments are known as artifacts related to "code smells." While well-written comments themselves are harmless, they can act as a "deodorant" for bad source code.

[Goal] The goal of this paper is to show a notable relationship between fault-proneness and the manner of commenting in methods declared in Java classes.

[Method] We focused on two types of comments: "documentation comments" and "inner comments." Documentation comments are those that immediately precede a method declaration, and inner comments are those written inside a method body. We collected the following data from several major open source products, for each method appearing in their source files: (1) Lines of Inner comments (LOI), (2) Lines of Documentation comments (LOD), and (3) Lines of Code (LOC). [Method:Analysis-1] We compared the ratios of faulty methods between sets of methods; Case 1: "LOI = 0 vs LOI > 0," and Case 2: "LOD = 0 vs LOD > 0." [Method:Analysis-2] We compared the ratios of faulty methods across categories defined by the number of comment lines.

[Results:Analysis-1] For all products, methods having one or more inner comments are about two to three times as likely to be faulty as those having no inner comments. Thus, there is a trend that a method with inner comments is more fault-prone than a method without them. On the other hand, the presence or absence of documentation comments did not show any specific tendency. Since a larger program tends to be more faulty, we also analyzed the effect of code size (LOC) on fault-proneness: we performed a logistic regression analysis using LOC, LOI, and LOD together in order to separate their effects. The result indicated that the comparisons using LOI and/or LOD are not dominated by code size (LOC). That is to say, it is worthwhile to examine associations between comments and fault-proneness.

[Results:Analysis-2] The ratio of faulty methods increased monotonically from LOI = 0 to LOI = 2. The ratios for LOI = 3 and LOI ≥ 4 do not continue the monotonic increase, but both remain higher than the ratio for LOI = 0.

[Conclusions] Our results revealed a novel finding that even one or two inner comments can point to faulty methods. That is to say, if a programmer wants to add comments to the code in a method, it may be a sign of poor-quality code, and the method has a higher potential to be faulty.
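
The group comparisons described in the abstract (faulty-method ratios for "LOI = 0 vs LOI > 0", "LOD = 0 vs LOD > 0", and per-LOI categories) can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: the `MethodRecord` type, its field names, the helper functions, and the toy data are all assumptions made for this example, and the logistic regression step is not covered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodRecord:
    """One Java method; the type and field names are hypothetical."""
    loi: int      # Lines of Inner comments
    lod: int      # Lines of Documentation comments
    loc: int      # Lines of Code
    faulty: bool  # whether a fault was recorded for this method


def faulty_ratio(methods):
    """Fraction of faulty methods in a collection (0.0 if it is empty)."""
    methods = list(methods)
    if not methods:
        return 0.0
    return sum(m.faulty for m in methods) / len(methods)


def compare_presence(methods, metric):
    """Analysis-1 style split: ratio for metric == 0, then for metric > 0."""
    without = [m for m in methods if getattr(m, metric) == 0]
    with_comments = [m for m in methods if getattr(m, metric) > 0]
    return faulty_ratio(without), faulty_ratio(with_comments)


def ratio_by_loi_category(methods, cap=4):
    """Analysis-2 style split: LOI = 0, 1, 2, 3 and LOI >= cap (keyed as cap)."""
    groups = defaultdict(list)
    for m in methods:
        groups[min(m.loi, cap)].append(m)
    return {category: faulty_ratio(group) for category, group in sorted(groups.items())}


# Made-up toy data for illustration only; these are not values from the paper.
sample = [
    MethodRecord(loi=0, lod=3, loc=10, faulty=False),
    MethodRecord(loi=0, lod=0, loc=8, faulty=False),
    MethodRecord(loi=0, lod=2, loc=15, faulty=True),
    MethodRecord(loi=0, lod=0, loc=5, faulty=False),
    MethodRecord(loi=1, lod=0, loc=20, faulty=True),
    MethodRecord(loi=1, lod=4, loc=12, faulty=False),
    MethodRecord(loi=2, lod=0, loc=30, faulty=True),
    MethodRecord(loi=5, lod=1, loc=60, faulty=True),
]

print("LOI = 0 vs LOI > 0:", compare_presence(sample, "loi"))
print("LOD = 0 vs LOD > 0:", compare_presence(sample, "lod"))
print("Ratio by LOI category:", ratio_by_loi_category(sample))
```

Raw ratios like these do not separate the effect of comments from that of code size, which is why the abstract additionally reports a logistic regression over LOC, LOI, and LOD.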
