Boosting Spectrum-Based Fault Localization for Spreadsheets with Product Metrics in a Learning Approach

Adil Mukhtar, Birgit Hofer, D. Jannach, F. Wotawa, Konstantin Schekotihin
{"title":"Boosting Spectrum-Based Fault Localization for Spreadsheets with Product Metrics in a Learning Approach","authors":"Adil Mukhtar, Birgit Hofer, D. Jannach, F. Wotawa, Konstantin Schekotihin","doi":"10.1145/3551349.3559546","DOIUrl":null,"url":null,"abstract":"Faults in spreadsheets are not uncommon and they can have significant negative consequences in practice. Various approaches for fault localization were proposed in recent years, among them techniques that transferred ideas from spectrum-based fault localization (SFL) to the spreadsheet domain. Applying SFL to spreadsheets proved to be effective, but has certain limitations. Specifically, the constrained computational structures of spreadsheets may lead to large sets of cells that have the same assumed fault probability according to SFL and thus have to be inspected manually. In this work, we propose to combine SFL with a fault prediction method based on spreadsheet metrics in a machine learning (ML) approach. In particular, we train supervised ML models using two orthogonal types of features: (i) variables that are used to compute similarity coefficients in SFL and (ii) spreadsheet metrics that have shown to be good predictors for faulty formulas in previous work. Experiments with a widely-used corpus of faulty spreadsheets indicate that the combined model helps to significantly improve fault localization performance in terms of wasted effort and accuracy.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3551349.3559546","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Faults in spreadsheets are not uncommon and they can have significant negative consequences in practice. Various approaches for fault localization were proposed in recent years, among them techniques that transferred ideas from spectrum-based fault localization (SFL) to the spreadsheet domain. Applying SFL to spreadsheets proved to be effective, but has certain limitations. Specifically, the constrained computational structures of spreadsheets may lead to large sets of cells that have the same assumed fault probability according to SFL and thus have to be inspected manually. In this work, we propose to combine SFL with a fault prediction method based on spreadsheet metrics in a machine learning (ML) approach. In particular, we train supervised ML models using two orthogonal types of features: (i) variables that are used to compute similarity coefficients in SFL and (ii) spreadsheet metrics that have shown to be good predictors for faulty formulas in previous work. Experiments with a widely-used corpus of faulty spreadsheets indicate that the combined model helps to significantly improve fault localization performance in terms of wasted effort and accuracy.
基于学习方法的产品度量提高电子表格基于频谱的故障定位
电子表格中的错误并不罕见,它们在实践中可能会产生重大的负面后果。近年来,人们提出了各种各样的故障定位方法,其中包括将基于频谱的故障定位(SFL)思想转移到电子表格领域的技术。将SFL应用于电子表格被证明是有效的,但有一定的局限性。具体来说,电子表格的约束计算结构可能会导致大量的单元格集合,这些单元格根据SFL具有相同的假定故障概率,因此必须手动检查。在这项工作中,我们提出在机器学习(ML)方法中将SFL与基于电子表格指标的故障预测方法结合起来。特别是,我们使用两种正交类型的特征来训练有监督的ML模型:(i)用于计算SFL中相似系数的变量和(ii)电子表格指标,这些指标在以前的工作中已被证明是错误公式的良好预测因子。对广泛使用的故障电子表格语料库进行的实验表明,该组合模型在浪费精力和准确性方面显著提高了故障定位性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信