Assessing the strength of Metamorphic Testing applied to optimisation software—Experience from industry

IF 4.3 | Zone 2, Computer Science | Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS
Alejandra Duque-Torres, Claus Klammer, Stefan Fischer, Dietmar Pfahl, Rudolf Ramler
Information and Software Technology, volume 186, Article 107807. Published 2025-06-14.
DOI: 10.1016/j.infsof.2025.107807
URL: https://www.sciencedirect.com/science/article/pii/S0950584925001466

Abstract

Context:

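Testing optimisation algorithms (OAs) is difficult because of the test oracle problem: for a given input, the correct (optimal) output is usually not known in advance. Metamorphic Testing (MT) addresses this challenge by checking relations between the outputs of related inputs instead of checking individual outputs. We previously applied MT to a black-box industrial OA and gained first insights into applying MT for OA testing.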
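To make the idea of a metamorphic relation concrete, here is a minimal sketch. It is not the industrial OA from the study: the toy `knapsack` solver and the two relations `mr_permutation` and `mr_scaling` are assumptions chosen purely for illustration. Each relation compares a source run with a follow-up run, so no oracle for the true optimum is required.

```python
import random


def knapsack(values, weights, capacity):
    """Toy SUT (hypothetical): best total value of a 0/1 knapsack via DP."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


def mr_permutation(values, weights, capacity, rng):
    """MR: reordering the items must not change the optimum."""
    order = list(range(len(values)))
    rng.shuffle(order)
    source = knapsack(values, weights, capacity)
    follow_up = knapsack(
        [values[i] for i in order], [weights[i] for i in order], capacity
    )
    return source == follow_up


def mr_scaling(values, weights, capacity, k=3):
    """MR: multiplying every value by k > 0 must multiply the optimum by k."""
    source = knapsack(values, weights, capacity)
    follow_up = knapsack([k * v for v in values], weights, capacity)
    return follow_up == k * source


if __name__ == "__main__":
    rng = random.Random(0)
    values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
    print("permutation MR holds:", mr_permutation(values, weights, capacity, rng))
    print("scaling MR holds:", mr_scaling(values, weights, capacity))
```

A relation that returns `False` signals a violation, which is the event the study counts when comparing MRs.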

Objective:

We noticed that some of the identified Metamorphic Relations (MRs) seemed to be more powerful than others. We now define and evaluate an approach to assess and rank MRs in terms of their defect-detection capabilities.

Method:

We propose a three-phase approach for assessing the strength of MRs. First, we evaluate the applicability of each MR based on Test Data (TD) using MetaTrimmer, an approach for selecting and constraining MRs. Second, we generate System Under Test (SUT) mutants and classify them into three levels. Level one contains mutants that do not change behaviour for any TD (equivalent mutants). Level two contains mutants that change behaviour for every TD (trivial mutants). Level three contains all other mutants. Third, we assess MR effectiveness per level based on the MR's violation/non-violation ratio.
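The second and third phases can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function names, the input layout (one boolean per TD item and mutant), and the choice to report violations as a fraction of all runs (rather than a violation/non-violation quotient, which is undefined when nothing passes) are assumptions.

```python
from collections import defaultdict


def classify_mutant(changed_per_td):
    """Return the mutant level from per-TD behaviour changes.

    1: behaviour changes for no TD (equivalent mutant)
    2: behaviour changes for every TD (trivial mutant)
    3: anything in between
    """
    if not any(changed_per_td):
        return 1
    if all(changed_per_td):
        return 2
    return 3


def violation_ratio_per_level(behaviour, violations):
    """Fraction of (mutant, TD) runs that violate one MR, grouped by level.

    behaviour:  {mutant_id: [bool per TD]}  True if output differs from the SUT
    violations: {mutant_id: [bool per TD]}  True if the MR was violated
    """
    violated = defaultdict(int)
    total = defaultdict(int)
    for mutant, changed in behaviour.items():
        level = classify_mutant(changed)
        outcomes = violations[mutant]
        violated[level] += sum(outcomes)
        total[level] += len(outcomes)
    return {lvl: violated[lvl] / total[lvl] for lvl in total if total[lvl]}


if __name__ == "__main__":
    # Hypothetical data: three mutants, three TD items each.
    behaviour = {
        "m1": [False, False, False],
        "m2": [True, True, True],
        "m3": [True, False, True],
    }
    violations = {
        "m1": [False, False, False],
        "m2": [True, True, False],
        "m3": [True, False, False],
    }
    print(violation_ratio_per_level(behaviour, violations))
```

Computing the same table for each MR allows MRs to be ranked within a level; level three is the most informative, since level-one mutants cannot be detected and level-two mutants are detected by almost any check.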

Results:

Of the 405 generated SUT mutants analysed, 236 fell into level one, 85 into level two, and 84 into level three. Analysing, per mutant and per level, how much of the TD triggered an MR violation revealed that some MRs are more sensitive than others.

Conclusion:

Our findings show that assessing MRs using our three-level strategy provides clear insights into their defect-detection capabilities. By integrating MetaTrimmer with mutation testing, we identify MRs that are effective in catching faults and sensitive to variations in TD. While evaluated on an OA in an industrial setting, this approach is generalisable to other SUTs. It is actionable for practitioners and researchers seeking to identify robust MRs and offers a structured methodology to evaluate and rank them.


Source journal
Information and Software Technology (Engineering and Technology; Computer Science: Software Engineering)
CiteScore: 9.10
Self-citation rate: 7.70%
Articles published: 164
Review time: 9.6 weeks
Journal description: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.