代码-注释不一致性的大规模实证研究

Fengcai Wen, Csaba Nagy, G. Bavota, Michele Lanza
{"title":"代码-注释不一致性的大规模实证研究","authors":"Fengcai Wen, Csaba Nagy, G. Bavota, Michele Lanza","doi":"10.1109/ICPC.2019.00019","DOIUrl":null,"url":null,"abstract":"Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document) and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study at date investigating how code and comments co-evolve. The study has been performed by mining 1.3 Billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"54 1","pages":"53-64"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"63","resultStr":"{\"title\":\"A Large-Scale Empirical Study on Code-Comment Inconsistencies\",\"authors\":\"Fengcai Wen, Csaba Nagy, G. Bavota, Michele Lanza\",\"doi\":\"10.1109/ICPC.2019.00019\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document) and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study at date investigating how code and comments co-evolve. The study has been performed by mining 1.3 Billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.\",\"PeriodicalId\":6853,\"journal\":{\"name\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"volume\":\"54 1\",\"pages\":\"53-64\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"63\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPC.2019.00019\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPC.2019.00019","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 63

摘要

代码注释是记录源代码的主要方法。在代码更改活动中保持注释的更新需要大量的时间和精力。由于这个原因,研究人员已经提出了检测代码注释不一致的方法(即,注释与它们所记录的代码不同步),并且已经进行了研究来调查这种现象。然而,这些研究是在小范围内进行的,依赖于定量分析,因此限制了关于代码注释不一致的经验知识。我们提出了迄今为止最大的研究,调查代码和注释是如何共同发展的。这项研究是通过从1500个系统的完整历史中挖掘13亿个ast水平的变化来完成的。此外,我们手工分析了500次提交,以定义由开发人员修复的代码注释不一致的分类。我们的分析揭示了不同类型的代码更改(例如,选择语句的更改)触发相关注释更新的程度,确定了代码-注释不一致更有可能被引入的情况。定义的分类法对开发人员修复的不一致类型进行分类。我们的结果可以指导旨在检测和修复代码注释不一致的工具的开发。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A Large-Scale Empirical Study on Code-Comment Inconsistencies
Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document) and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study at date investigating how code and comments co-evolve. The study has been performed by mining 1.3 Billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信