Mining Historical Test Logs to Predict Bugs and Localize Faults in the Test Logs

2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). Pub Date: 2019-05-25. DOI: 10.1109/ICSE.2019.00031

Anunay Amar, Peter C. Rigby

Pages: 140-151

Citations: 32

Abstract

Software testing is an integral part of modern software development. However, test runs can produce thousands of lines of logged output that make it difficult to find the cause of a fault in the logs. This problem is exacerbated by environmental failures that distract from product faults. In this paper, we present techniques with the goal of capturing the maximum number of product faults while flagging the minimum number of log lines for inspection. We observe that the location of a fault in a log should be contained in the lines of a failing test log. In contrast, a passing test log should not contain the lines related to a failure. Lines that occur in both a passing and a failing log introduce noise when attempting to find the fault in a failing log. We introduce an approach where we remove the lines that occur in the passing log from the failing log. After removing these lines, we use information retrieval techniques to flag the most probable lines for investigation. We modify TF-IDF to identify the most relevant log lines related to past product failures. We then vectorize the logs and develop an exclusive version of KNN to identify which logs are likely to lead to product faults and which lines are the most probable indication of the failure. Our best approach, LogFaultFlagger, finds 89% of the total faults and flags less than 1% of the total failed log lines for inspection. LogFaultFlagger drastically outperforms the previous work, CAM. We implemented LogFaultFlagger as a tool at Ericsson, where it presents fault prediction summaries to base station testers.
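
The first two steps described in the abstract (dropping failing-log lines that also occur in a passing log, then ranking what remains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how TF-IDF was modified or how the exclusive KNN works, so the sketch substitutes a textbook inverse document frequency over a hypothetical collection of past logs and omits the KNN step. All function names, parameters, and log lines below are invented for illustration.

```python
import math
from collections import Counter


def remove_passing_lines(failing_log, passing_log):
    """Keep only the failing-log lines that never occur in the passing log."""
    seen_in_pass = set(passing_log)
    return [line for line in failing_log if line not in seen_in_pass]


def idf_scores(candidate_lines, past_logs):
    """Score each line with a smoothed, textbook IDF over past logs.

    Assumption: plain IDF stands in for the paper's modified TF-IDF,
    whose exact form is not given in the abstract.
    """
    n_logs = len(past_logs)
    doc_freq = Counter()
    for log in past_logs:
        doc_freq.update(set(log))  # count each line once per log
    return {
        line: math.log((1 + n_logs) / (1 + doc_freq[line]))
        for line in candidate_lines
    }


def flag_lines(failing_log, passing_log, past_logs, top_n=1):
    """Return the top_n highest-scoring lines left after the diff step."""
    remaining = remove_passing_lines(failing_log, passing_log)
    candidates = list(dict.fromkeys(remaining))  # dedupe, keep order
    scores = idf_scores(candidates, past_logs)
    ranked = sorted(candidates, key=lambda line: scores[line], reverse=True)
    return ranked[:top_n]


# Invented example data, not taken from the paper.
passing = ["boot ok", "link up", "test start"]
failing = ["boot ok", "link up", "test start",
           "timeout on cell 3", "retry scheduled"]
past_logs = [
    ["boot ok", "retry scheduled"],
    ["boot ok", "retry scheduled", "link up"],
    ["boot ok"],
]

print(flag_lines(failing, passing, past_logs, top_n=1))
```

In this toy data, "retry scheduled" appears in two of the three past logs while "timeout on cell 3" appears in none, so the rarer line ranks first. The paper's weighting is tied to past product failures, so its ranking may differ from this rarity-only score.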
