A Lightweight Fault Localization Approach based on XGBoost

Bo Yang, Yuze He, Huai Liu, Yixin Chen, Zhi Jin
{"title":"A Lightweight Fault Localization Approach based on XGBoost","authors":"Bo Yang, Yuze He, Huai Liu, Yixin Chen, Zhi Jin","doi":"10.1109/QRS51102.2020.00033","DOIUrl":null,"url":null,"abstract":"Software fault localization is one of the key activities in software debugging. The program spectrum-based approach is widely used in fault localization. However, lots of program information, for example, the sequence of the execution statement and statement semantics, is missing when such an approach is utilized, which affects the performance. XGBoost is an effective learning algorithm, which can use the characteristics of the training data to build a classification tree during training. In addition, XGBoost can iteratively adjust the information value of the feature, so that the training process retains the importance information of the feature. This paper proposes applying XGBoost into fault localization utilizing information of program execution behaviors. A novel method called XGB-FL is developed, where the program spectrum information is converted into a coverage matrix to train the XGBoost model. We can get the characteristics of the data through the trained model and the importance of the program statement in the classification process. This is also the basis for judging whether the statement is likely to contain a fault. Nine representative data sets have been chosen to evaluate the performance of XGB-FL. The experimental results show that XGB-FL can generally deliver a higher performance in fault localization than those baseline techniques, in terms of precision and efficiency.","PeriodicalId":301814,"journal":{"name":"2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/QRS51102.2020.00033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Software fault localization is one of the key activities in software debugging. The program spectrum-based approach is widely used in fault localization. However, lots of program information, for example, the sequence of the execution statement and statement semantics, is missing when such an approach is utilized, which affects the performance. XGBoost is an effective learning algorithm, which can use the characteristics of the training data to build a classification tree during training. In addition, XGBoost can iteratively adjust the information value of the feature, so that the training process retains the importance information of the feature. This paper proposes applying XGBoost into fault localization utilizing information of program execution behaviors. A novel method called XGB-FL is developed, where the program spectrum information is converted into a coverage matrix to train the XGBoost model. We can get the characteristics of the data through the trained model and the importance of the program statement in the classification process. This is also the basis for judging whether the statement is likely to contain a fault. Nine representative data sets have been chosen to evaluate the performance of XGB-FL. The experimental results show that XGB-FL can generally deliver a higher performance in fault localization than those baseline techniques, in terms of precision and efficiency.
基于XGBoost的轻量级故障定位方法
软件故障定位是软件调试的关键环节之一。基于程序谱的方法在故障定位中得到了广泛的应用。但是,在使用这种方法时,会丢失许多程序信息,例如执行语句的顺序和语句语义,从而影响性能。XGBoost是一种有效的学习算法,它可以在训练过程中利用训练数据的特征来构建分类树。此外,XGBoost可以迭代调整特征的信息值,使训练过程保留特征的重要信息。本文提出利用程序执行行为信息,将XGBoost应用于故障定位。提出了一种新的XGB-FL方法,将程序频谱信息转换成覆盖矩阵来训练XGBoost模型。我们可以通过训练好的模型得到数据的特征,以及程序语句在分类过程中的重要性。这也是判断该陈述是否可能包含错误的基础。选择了9个具有代表性的数据集来评估XGB-FL的性能。实验结果表明,在精度和效率方面,XGB-FL在故障定位方面总体上优于现有的基线技术。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信