Automatically Detecting Variability Bugs Through Hybrid Control and Data Flow Analysis
Kelly Kaoudis, Henrik Brodin, E. Sultanik
2023 IEEE Security and Privacy Workshops (SPW), published 2023-05-01
DOI: 10.1109/SPW59333.2023.00022 (https://doi.org/10.1109/SPW59333.2023.00022)
Abstract
Subtle bugs that manifest only in certain software configurations are notoriously difficult to trace correctly. Sometimes called Heisenbugs, these runtime variability flaws can result from invoking undefined behavior in languages like C and C++, or from compiler flaws. In this paper, we present a novel analysis technique for detecting variability bugs and correctly diagnosing their impact on a program by comparing control-affecting data flow across differently compiled program variants. Our UBet prototype dynamically derives a runtime control flow trace while tracing universal data flow for a program processing a given input, operating at a level of tracing completeness not achievable through similar dynamic instrumentation means. Absent compiler bugs or undefined behavior, every compile-time configuration of a program (i.e., every build that differs only in compiler flags) should be semantically equivalent. Thus, any input for which the program variants produce inconsistent output indicates a variability bug. Our analysis compares control-affecting data flow traces from the disagreeing variant runs to identify the related input bytes and determine where in the program the processing variability originates. Though we initially demonstrate our technique on C++ variability bugs in Nitro, the United States Department of Defense reference implementation parser for NITF (National Imagery Transmission Format), our approach applies equally to other programs and input types beyond NITF parsers. Finally, we sketch a path toward completing this work and refining our analysis, including evaluating parsers of other input formats.
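
The trace comparison described in the abstract can be illustrated with a minimal Python sketch. This is not UBet's implementation or trace format: the `TraceEvent` record, the helper functions, the function names in the traces, and the byte offsets are all hypothetical, chosen only to show how two control-affecting data flow traces from disagreeing variants might be aligned to find where they first diverge and which input bytes are involved.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical trace record: a function that was entered, plus the
    offsets of the input bytes that influenced the branch leading there."""
    function: str
    input_offsets: frozenset


def ev(function, *offsets):
    """Convenience constructor for a TraceEvent (illustrative helper)."""
    return TraceEvent(function, frozenset(offsets))


def first_divergence(trace_a, trace_b):
    """Return (index, event_a, event_b) at the first position where the two
    traces differ, or None if they are identical. A missing event (one trace
    ended early) is reported as None."""
    for index, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return index, a, b
    if len(trace_a) != len(trace_b):
        index = min(len(trace_a), len(trace_b))
        a = trace_a[index] if index < len(trace_a) else None
        b = trace_b[index] if index < len(trace_b) else None
        return index, a, b
    return None


def implicated_offsets(divergence):
    """Collect the input byte offsets attached to the diverging events."""
    _, a, b = divergence
    offsets = set()
    for event in (a, b):
        if event is not None:
            offsets |= event.input_offsets
    return sorted(offsets)


# Made-up traces for one input processed by two builds of the same program
# (for example, an unoptimized and an optimized build). The names are
# assumptions for illustration, not taken from Nitro or UBet.
debug_trace = [
    ev("main"),
    ev("parse_header", 0, 1, 2, 3),
    ev("read_length", 4, 5),
    ev("accept_record", 4, 5),
]
release_trace = [
    ev("main"),
    ev("parse_header", 0, 1, 2, 3),
    ev("read_length", 4, 5),
    ev("reject_record", 4, 5),
]

divergence = first_divergence(debug_trace, release_trace)
if divergence is not None:
    index, a, b = divergence
    print(f"traces diverge at event {index}: {a.function} vs {b.function}")
    print(f"input bytes involved: {implicated_offsets(divergence)}")
```

In this toy case both variants agree until the branch that depends on bytes 4 and 5, which is the kind of localization the abstract describes: the diverging event points to where in the program the variability originates, and the attached offsets point to the related input bytes. A real analysis would need to tolerate benign differences between builds, such as inlined functions, which this sketch does not attempt.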