Hirohisa Aman, Takashi Sasaki, S. Amasaki, Minoru Kawahara
Empirical analysis of comments and fault-proneness in methods: can comments point to faulty methods?
International Symposium on Empirical Software Engineering and Measurement, 2014-09-18
DOI: 10.1145/2652524.2652592
Citations: 5
Abstract
[Context]
Comments improve the readability of programs, so in themselves they are harmless to software quality. However, comments are sometimes added to compensate for a lack of readability in complicated programs: some programmers add in-depth comments to code fragments that are hard for other developers to understand. In the field of code refactoring, well-written comments are known as artifacts related to "code smells." While well-written comments themselves are harmless, they can act as a "deodorant" for bad source code.
[Goal]
The goal of this paper is to show a notable relationship between fault-proneness and the way comments are written in methods declared in Java classes.
[Method]
We focused on two types of comments: "documentation comments" and "inner comments." Documentation comments are those followed by a method declaration, and inner comments are those written inside a method body. We collected the following data from some major open source products, for each method appearing in their source files: (1) Lines of Inner comments (LOI), (2) Lines of Documentation comments (LOD), and (3) Lines of Code (LOC). [Method:Analysis-1] We compared the ratios of faulty methods between sets of methods; Case 1: "LOI = 0 vs LOI > 0," and Case 2: "LOD = 0 vs LOD > 0." [Method:Analysis-2] We compared the ratios of faulty methods across categories defined by the number of comment lines.
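To make the three metrics concrete, the following is a minimal sketch of how LOD, LOI, and LOC might be counted for a single Java method given as text. The abstract does not state the authors' counting rules, so everything here is an assumption: the function name `count_method_lines` is hypothetical, blank lines are ignored, the signature and brace lines count toward LOC, a code line with a trailing `//` comment counts toward both LOC and LOI, and comment markers inside string literals are not handled.

```python
def count_method_lines(method_source: str) -> dict:
    """Count LOD, LOI and LOC for one Java method (simplified, line-based).

    Assumed rules (not taken from the paper): a leading /** ... */ block is
    the documentation comment; every later comment line is an inner comment.
    """
    lod = loi = loc = 0
    in_doc = False      # inside the leading /** ... */ block
    in_block = False    # inside a /* ... */ block after the doc comment
    seen_code = False   # True once anything other than the doc comment appears
    for raw in method_source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are not counted
        if in_doc:
            lod += 1
            in_doc = "*/" not in line
        elif in_block:
            loi += 1
            in_block = "*/" not in line
        elif not seen_code and line.startswith("/**"):
            lod += 1
            in_doc = "*/" not in line
        elif line.startswith("//"):
            seen_code = True
            loi += 1
        elif line.startswith("/*"):
            seen_code = True
            loi += 1
            in_block = "*/" not in line
        else:
            seen_code = True
            loc += 1
            if "//" in line:
                loi += 1  # trailing line comment on a code line
    return {"LOD": lod, "LOI": loi, "LOC": loc}


# Hypothetical sample method, written only for this illustration.
SAMPLE_METHOD = """
/**
 * Returns the sum of two values.
 */
int sum(int a, int b) {
    // add the values
    int s = a + b; // trailing note
    return s;
}
"""

print(count_method_lines(SAMPLE_METHOD))
```

For the sample above, the sketch counts three documentation comment lines, two inner comment lines, and four code lines. A real measurement tool would more likely work on a parsed syntax tree than on raw lines.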
[Results:Analysis-1]
For all products, methods with one or more inner comments are about two to three times as likely to be faulty as methods with no inner comments. There is therefore a trend in which a method with inner comments is more fault-prone than a non-commented method. On the other hand, the presence or absence of documentation comments did not show any specific tendency.
Since a larger program tends to be more faulty, we also analyzed the effect of code size (LOC) on fault-proneness. We performed a logistic regression analysis using LOC, LOI, and LOD together in order to separate their effects. The result indicated that the comparisons using LOI and/or LOD are not dominated by code size (LOC). That is to say, it is worthwhile to examine the association of comments with fault-proneness.
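The idea of separating the effects of LOC, LOI, and LOD can be sketched with a small logistic regression fitted by plain gradient ascent. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say which tool or fitting method was used, the data in `TOY_METHODS` are invented for this example and carry no empirical meaning, and the helper names (`standardize`, `fit_logistic`, and so on) are hypothetical.

```python
import math

# Hypothetical toy data, invented for illustration only (NOT from the paper).
# Each row: (LOC, LOI, LOD, faulty) for one method.
TOY_METHODS = [
    (5, 0, 0, 0), (8, 0, 3, 0), (12, 1, 0, 1), (15, 0, 4, 0), (20, 2, 0, 1),
    (22, 0, 0, 0), (30, 3, 5, 1), (35, 0, 2, 1), (40, 1, 0, 0), (60, 4, 6, 1),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(k)]
    return [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]


def predict(weights, x):
    """Predicted probability of being faulty; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def log_likelihood(weights, xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict(weights, x), 1e-12), 1 - 1e-12)
        total += math.log(p) if y else math.log(1 - p)
    return total


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Maximize the log-likelihood by batch gradient ascent."""
    n, k = len(xs), len(xs[0])
    weights = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, y in zip(xs, ys):
            err = y - predict(weights, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights


features = standardize([row[:3] for row in TOY_METHODS])
labels = [row[3] for row in TOY_METHODS]
weights = fit_logistic(features, labels)
for name, w in zip(["intercept", "LOC", "LOI", "LOD"], weights):
    print(f"{name}: {w:+.3f}")
```

Because all three predictors enter the same model, each coefficient describes the association of that predictor with fault-proneness while the other two are held fixed, which is the sense in which the regression takes their impacts apart. In practice an established statistics package that also reports standard errors and significance would be preferable to this hand-written fit.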
[Results:Analysis-2]
The ratio of faulty methods increased monotonically from LOI = 0 to LOI = 2. The ratios at LOI = 3 and LOI ≥ 4 do not continue this monotonic increase, but both remain higher than the ratio at LOI = 0.
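The ratio comparisons in both analyses reduce to grouping methods by a category of LOI and dividing the number of faulty methods by the group size. The sketch below shows that computation. The function names and the `TOY_DATA` records are hypothetical and invented for illustration, so the resulting numbers say nothing about the studied products.

```python
from collections import defaultdict


def faulty_ratios(methods, categorize):
    """Return {category: share of faulty methods} for (loi, is_faulty) pairs."""
    totals = defaultdict(int)
    faulty = defaultdict(int)
    for loi, is_faulty in methods:
        category = categorize(loi)
        totals[category] += 1
        faulty[category] += int(is_faulty)
    return {category: faulty[category] / totals[category] for category in totals}


def presence(loi):
    """Analysis-1, Case 1: methods without vs. with inner comments."""
    return "LOI = 0" if loi == 0 else "LOI > 0"


def loi_bin(loi):
    """Analysis-2: one category per LOI value, with 4 or more merged."""
    return f"LOI = {loi}" if loi < 4 else "LOI >= 4"


# Hypothetical (LOI, faulty?) records, invented for illustration only.
TOY_DATA = [
    (0, False), (0, False), (0, True), (0, False),
    (1, True), (1, False), (2, True), (5, True), (7, False),
]

print(faulty_ratios(TOY_DATA, presence))
print(faulty_ratios(TOY_DATA, loi_bin))
```

Passing the categorization as a function keeps the counting logic in one place; Case 2 of Analysis-1 would use the same function with LOD values in place of LOI.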
[Conclusions]
Our results revealed a novel finding: even one or two inner comments can point to faulty methods. That is to say, if a programmer feels the need to add comments to the code inside a method, it may be a sign of poor-quality code, and the method has a higher potential to be faulty.