BabelRTS: Polyglot Regression Test Selection

IF 6.5 | Zone 1: Computer Science | Q1: COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering Pub Date : 2025-03-27 DOI:10.1109/TSE.2025.3554403

Gabriele Maurina;Walter Cazzola;Sudipto Ghosh

Volume 51, Issue 5, pp. 1487-1499. Full text: https://ieeexplore.ieee.org/document/10944548/

Citations: 0

Abstract

Regression test selection (RTS) approaches reduce the number of regression tests. Current RTS approaches are typically monoglot, i.e., their implementations target a specific language. However, many subjects under test (SUT) are polyglot, i.e., they use multiple languages. Running multiple monoglot RTS approaches separately on a polyglot SUT is unsafe because tests that involve inter-language dependencies can be missed. Moreover, a new language may require completely reimplementing an RTS approach, especially if the original implementation relies on language and runtime features that are not available in the new language. We propose a new static approach called BabelRTS, which is multilingual (supports multiple languages out of the box), polyglot (analyzes SUTs written in multiple languages), and extensible (allows adding support for new languages). A key contribution is the idea of encapsulating the language-specific aspects of RTS by using patterns and actions. A pattern specifies programming language constructs used in each file that indicate dependencies to other files written in the same or a different language. An action specifies how to identify these files in the codebase. Patterns and actions can be customized to support new languages without modifying the test selection algorithm. BabelRTS is not tied to a specific language run-time system or paradigm. BabelRTS currently supports 12 languages and 5 language combinations. We evaluated BabelRTS on 142 open-source monoglot and polyglot SUTs, analyzing a total of more than two billion LOC. The performance of BabelRTS was similar to the state-of-the-art monoglot approaches on monoglot SUTs. On polyglot SUTs, BabelRTS was safer in polyglot mode and selected more tests for 60% of the commits than in monoglot mode, which missed inter-language dependencies.
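The pattern/action idea in the abstract can be illustrated with a minimal sketch. This is not the BabelRTS implementation: the rule table, the action functions, the `test_` naming convention, and the assumption that loading `lib<name>.so` from Python implies a dependency on `<name>.c` are all hypothetical choices made for illustration. A pattern here is a regular expression that spots a dependency construct in a file; an action maps the match to files in the codebase. The selection loop itself never mentions a language.

```python
import os
import re
from collections import deque

# --- Hypothetical actions: turn a pattern match into files in the codebase ---
def py_import_action(match, files):
    """`import foo` depends on foo.py, if that file exists."""
    candidate = match.group(1) + ".py"
    return [candidate] if candidate in files else []

def py_loads_c_action(match, files):
    """Cross-language (assumed convention): lib<name>.so is built from <name>.c."""
    candidate = match.group(1) + ".c"
    return [candidate] if candidate in files else []

def c_include_action(match, files):
    """`#include "x.h"` depends on x.h, if that file exists."""
    candidate = match.group(1)
    return [candidate] if candidate in files else []

# --- Hypothetical rule table: file extension -> list of (pattern, action) ---
RULES = {
    ".py": [
        (re.compile(r"^\s*import\s+(\w+)", re.MULTILINE), py_import_action),
        (re.compile(r"CDLL\(['\"]lib(\w+)\.so['\"]\)"), py_loads_c_action),
    ],
    ".c": [
        (re.compile(r'#include\s+"([^"]+)"'), c_include_action),
    ],
}

def reachable(start, files, rules):
    """Files transitively reachable from `start` through pattern/action rules."""
    seen, queue = {start}, deque([start])
    while queue:
        path = queue.popleft()
        ext = os.path.splitext(path)[1]
        for pattern, action in rules.get(ext, []):
            for match in pattern.finditer(files[path]):
                for dep in action(match, files):
                    if dep not in seen:
                        seen.add(dep)
                        queue.append(dep)
    return seen

def select_tests(files, changed, rules):
    """Select every test file whose dependency closure touches a changed file."""
    tests = sorted(p for p in files if os.path.basename(p).startswith("test_"))
    return [t for t in tests if reachable(t, files, rules) & set(changed)]

# Toy codebase held in memory: path -> source text.
FILES = {
    "test_math.py": "import fastmath\n",
    "fastmath.py": "import ctypes\nlib = ctypes.CDLL('libkernel.so')\n",
    "kernel.c": '#include "kernel.h"\n',
    "kernel.h": "",
    "test_util.py": "import util\n",
    "util.py": "",
}

# Monoglot variant: same rules minus the Python-to-C rule.
MONOGLOT_RULES = {".py": RULES[".py"][:1], ".c": RULES[".c"]}

polyglot_selection = select_tests(FILES, {"kernel.h"}, RULES)
monoglot_selection = select_tests(FILES, {"kernel.h"}, MONOGLOT_RULES)
```

In this toy setup a change to `kernel.h` selects `test_math.py` only when the cross-language rule is present; without it the selection is empty, which is the kind of missed inter-language dependency the abstract describes. Adding a language would mean adding entries to the rule table, leaving `reachable` and `select_tests` unchanged.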


Source journal

IEEE Transactions on Software Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

9.70

Self-citation rate

10.80%

Articles published

724

Review time

6 months

Journal description: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.