Isolating Compiler Faults Through Differentiated Compilation Configurations
Authors: Yibiao Yang; Qingyang Li; Maolin Sun; Jing Yang; Jiangchang Wu; Yuming Zhou
Journal: IEEE Transactions on Software Engineering, vol. 51, no. 6, pp. 1838-1853 (impact factor 5.6; JCR Q1, Computer Science, Software Engineering)
Publication date: 2025-03-13
DOI: 10.1109/TSE.2025.3569530
URL: https://ieeexplore.ieee.org/document/11002719/
Citations: 0
Abstract
Compilation optimization bugs are prevalent and can significantly affect the correctness of software products, posing serious challenges to software development. Identifying and localizing these bugs are critical tasks for compiler developers. However, the intricate nature and extensive scale of modern compilers make it difficult to pinpoint the root causes of such bugs. Previous research has introduced innovative techniques that generate witness test programs (tests that pass) by mutating bug-triggering test cases, highlighting the importance of this problem and demonstrating the effectiveness of such approaches. Nevertheless, existing techniques based on witness test program generation suffer from inherent limitations. Specifically, they do not guarantee the successful creation of witness test programs via mutation and are often time-consuming, typically requiring extensive iterations to produce a valid witness test program. In this study, we present Odfl, a simple yet effective approach for automatically isolating compiler optimization faults by introducing the concept of differentiated compilation configurations. The core insight behind Odfl is that modifying compilation settings, such as disabling fine-grained compilation flags in GCC or reducing the number of fine-grained compilation passes in LLVM, can suppress the manifestation of compiler bugs triggered by the same test program. By adjusting these settings, Odfl creates differentiated compilation configurations that produce multiple compiler executions with distinct pass/fail outcomes. We utilize these differentiated configurations to collect both passing and failing compiler coverage, and then apply Spectrum-Based Fault Localization (SBFL) techniques to rank compiler source files based on their suspiciousness.
Our evaluation of 60 GCC and 50 LLVM compiler bugs demonstrates that Odfl substantially outperforms state-of-the-art compiler fault localization techniques in terms of both effectiveness and efficiency. Notably, Odfl achieves over 90% improvement in accurately ranking the top-1 faulty source files compared to three existing techniques (DiWi, RecBi, and LLM4CBI) and reduces fault localization time by more than 99% on average.
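The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which SBFL formula Odfl uses, so Ochiai is assumed here as one common choice, and the `Run` record, the file names, and the pass/fail outcomes are hypothetical toy data. Each run stands for one compiler execution of the same test program under a different compilation configuration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One compiler execution under a single compilation configuration."""
    disabled_flags: frozenset  # flags switched off for this run (toy data)
    covered_files: frozenset   # compiler source files executed in this run
    failed: bool               # True if the bug still manifests


def ochiai_ranking(runs):
    """Rank compiler source files by Ochiai suspiciousness.

    Ochiai is an assumption here, chosen as a widely used SBFL formula:
        score(f) = ef / sqrt(total_failed * (ef + ep))
    where ef / ep count the failing / passing runs that cover file f.
    """
    total_failed = sum(1 for r in runs if r.failed)
    files = set().union(*(r.covered_files for r in runs))
    scores = {}
    for f in files:
        ef = sum(1 for r in runs if r.failed and f in r.covered_files)
        ep = sum(1 for r in runs if not r.failed and f in r.covered_files)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[f] = ef / denom if denom else 0.0
    # Highest score first; ties broken by file name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical differentiated configurations for one bug-triggering program.
runs = [
    # Baseline configuration: the bug manifests.
    Run(frozenset(),
        frozenset({"common.c", "pass_a.c", "pass_b.c"}), failed=True),
    # Disabling the flag behind pass_a suppresses the bug (passing run).
    Run(frozenset({"-fno-pass-a"}),
        frozenset({"common.c", "pass_b.c"}), failed=False),
    # Disabling the flag behind pass_b leaves the bug in place.
    Run(frozenset({"-fno-pass-b"}),
        frozenset({"common.c", "pass_a.c"}), failed=True),
]

ranking = ochiai_ranking(runs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data, `pass_a.c` is covered only by failing runs and therefore ranks first, while `common.c`, which every run covers, ranks lower. That is the effect the pass/fail contrast between configurations is meant to produce.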
About the journal:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.