Characterizing and Detecting Anti-Patterns in the Logging Code

Boyuan Chen, Z. Jiang
{"title":"Characterizing and Detecting Anti-Patterns in the Logging Code","authors":"Boyuan Chen, Z. Jiang","doi":"10.1109/ICSE.2017.15","DOIUrl":null,"url":null,"abstract":"Snippets of logging code are output statements (e.g., LOG.info or System.out.println) that developers insert into a software system. Although more logging code can provide more execution context of the system's behavior during runtime, it is undesirable to instrument the system with too much logging code due to maintenance overhead. Furthermore, excessive logging may cause unexpected side-effects like performance slow-down or high disk I/O bandwidth. Recent studies show that there are no well-defined coding guidelines for performing effective logging. Previous research on the logging code mainly tackles the problems of where-to-log and what-to-log. There are very few works trying to address the problem of how-to-log (developing and maintaining high-quality logging code). In this paper, we study the problem of how-to-log by characterizing and detecting the anti-patterns in the logging code. As the majority of the logging code is evolved together with the feature code, the remaining set of logging code changes usually contains the fixes to the anti-patterns. We have manually examined 352 pairs of independently changed logging code snippets from three well-maintenance open source systems: ActiveMQ, Hadoop and Maven. Our analysis has resulted in six different anti-patterns in the logging code. To demonstrate the value of our findings, we have encoded these anti-patterns into a static code analysis tool, LCAnalyzer. Case studies show that LCAnalyzer has an average recall of 95% and precision of 60% and can be used to automatically detect previously unknown anti-patterns in the source code. To gather feedback, we have filed 64 representative instances of the logging code anti-patterns from the most recent releases of ten open source software systems. Among them, 46 instances (72%) have already been accepted by their developers.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"53 1","pages":"71-81"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"90","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2017.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 90

Abstract

Snippets of logging code are output statements (e.g., LOG.info or System.out.println) that developers insert into a software system. Although more logging code can provide more execution context of the system's behavior during runtime, it is undesirable to instrument the system with too much logging code due to maintenance overhead. Furthermore, excessive logging may cause unexpected side-effects like performance slow-down or high disk I/O bandwidth. Recent studies show that there are no well-defined coding guidelines for performing effective logging. Previous research on the logging code mainly tackles the problems of where-to-log and what-to-log. There are very few works trying to address the problem of how-to-log (developing and maintaining high-quality logging code). In this paper, we study the problem of how-to-log by characterizing and detecting the anti-patterns in the logging code. As the majority of the logging code is evolved together with the feature code, the remaining set of logging code changes usually contains the fixes to the anti-patterns. We have manually examined 352 pairs of independently changed logging code snippets from three well-maintenance open source systems: ActiveMQ, Hadoop and Maven. Our analysis has resulted in six different anti-patterns in the logging code. To demonstrate the value of our findings, we have encoded these anti-patterns into a static code analysis tool, LCAnalyzer. Case studies show that LCAnalyzer has an average recall of 95% and precision of 60% and can be used to automatically detect previously unknown anti-patterns in the source code. To gather feedback, we have filed 64 representative instances of the logging code anti-patterns from the most recent releases of ten open source software systems. Among them, 46 instances (72%) have already been accepted by their developers.
日志代码中反模式的表征和检测
日志代码片段是开发人员插入到软件系统中的输出语句(例如,LOG.info或system. out.println)。虽然更多的日志代码可以在运行时提供更多的系统行为的执行上下文,但由于维护开销,不希望使用太多的日志代码来检测系统。此外,过多的日志记录可能会导致意想不到的副作用,如性能下降或磁盘I/O带宽高。最近的研究表明,没有定义良好的编码准则来执行有效的日志记录。以往对日志代码的研究主要是解决在哪里记录和记录什么内容的问题。很少有作品试图解决如何记录日志的问题(开发和维护高质量的日志代码)。本文通过对日志代码中的反模式进行表征和检测,研究如何进行日志记录的问题。由于大多数日志代码与特性代码一起演进,其余的日志代码变更通常包含对反模式的修复。我们手工检查了来自三个维护良好的开源系统(ActiveMQ、Hadoop和Maven)的352对独立更改的日志代码片段。我们的分析在日志代码中产生了六种不同的反模式。为了演示我们的发现的价值,我们将这些反模式编码到静态代码分析工具LCAnalyzer中。案例研究表明,LCAnalyzer的平均召回率为95%,精度为60%,可用于自动检测源代码中以前未知的反模式。为了收集反馈,我们从10个开源软件系统的最新版本中整理了64个具有代表性的日志代码反模式实例。其中,46个实例(72%)已经被开发人员接受。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信