{"title":"测试用例缩减:一个框架、基准和比较研究","authors":"Patrick Kreutzer, Tom Kunze, M. Philippsen","doi":"10.1109/ICSME52107.2021.00012","DOIUrl":null,"url":null,"abstract":"Given a program that triggers a bug in a compiler (or other kind of language processor), the goal of test case reduction is to cut away all code that is irrelevant for the bug, i.e., to generate a smaller program that still induces the bug. Research has proposed several language-agnostic reduction techniques that automatically reduce bug-inducing programs in arbitrary programming languages, but there is no large-scale, conclusive evaluation of these algorithms yet. Furthermore, the development of new algorithms is hampered by the unavailability of comparable implementations of previous techniques and of diverse test programs that trigger different bugs in real compilers. To close these gaps and to foster future research in this area, this paper makes three contributions: (1) A framework that includes efficient, fine-tuned implementations of 6 state-of-the-art reducers, (2) a diverse benchmark that comprises 321 fuzzer-generated programs in two programming languages that trigger 110 different bugs in real compilers, and (3) a comparative study that builds upon our framework and benchmark and compares the reduction techniques w.r.t. their effectiveness and efficiency. Our results show that there is no reduction technique yet that performs best across all test cases and languages. Our framework and benchmark are available online and we provide the necessary scripts and tools to replicate our study.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"29 6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Test Case Reduction: A Framework, Benchmark, and Comparative Study\",\"authors\":\"Patrick Kreutzer, Tom Kunze, M. Philippsen\",\"doi\":\"10.1109/ICSME52107.2021.00012\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Given a program that triggers a bug in a compiler (or other kind of language processor), the goal of test case reduction is to cut away all code that is irrelevant for the bug, i.e., to generate a smaller program that still induces the bug. Research has proposed several language-agnostic reduction techniques that automatically reduce bug-inducing programs in arbitrary programming languages, but there is no large-scale, conclusive evaluation of these algorithms yet. Furthermore, the development of new algorithms is hampered by the unavailability of comparable implementations of previous techniques and of diverse test programs that trigger different bugs in real compilers. To close these gaps and to foster future research in this area, this paper makes three contributions: (1) A framework that includes efficient, fine-tuned implementations of 6 state-of-the-art reducers, (2) a diverse benchmark that comprises 321 fuzzer-generated programs in two programming languages that trigger 110 different bugs in real compilers, and (3) a comparative study that builds upon our framework and benchmark and compares the reduction techniques w.r.t. their effectiveness and efficiency. Our results show that there is no reduction technique yet that performs best across all test cases and languages. 
Our framework and benchmark are available online and we provide the necessary scripts and tools to replicate our study.\",\"PeriodicalId\":205629,\"journal\":{\"name\":\"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)\",\"volume\":\"29 6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSME52107.2021.00012\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSME52107.2021.00012","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Test Case Reduction: A Framework, Benchmark, and Comparative Study
Given a program that triggers a bug in a compiler (or other kind of language processor), the goal of test case reduction is to cut away all code that is irrelevant to the bug, i.e., to generate a smaller program that still induces the bug. Research has proposed several language-agnostic reduction techniques that automatically reduce bug-inducing programs in arbitrary programming languages, but there has been no large-scale, conclusive evaluation of these algorithms yet. Furthermore, the development of new algorithms is hampered by the unavailability of comparable implementations of previous techniques and of diverse test programs that trigger different bugs in real compilers. To close these gaps and to foster future research in this area, this paper makes three contributions: (1) a framework that includes efficient, fine-tuned implementations of 6 state-of-the-art reducers, (2) a diverse benchmark that comprises 321 fuzzer-generated programs in two programming languages that trigger 110 different bugs in real compilers, and (3) a comparative study that builds upon our framework and benchmark and compares the reduction techniques with respect to their effectiveness and efficiency. Our results show that no reduction technique yet performs best across all test cases and languages. Our framework and benchmark are available online, and we provide the necessary scripts and tools to replicate our study.
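To make the reduction setting concrete, the following is a minimal sketch of line-based delta debugging (ddmin), the classic language-agnostic algorithm that many modern reducers build on. It is not the paper's framework or any of its 6 evaluated reducers, and it is simplified to test only complements of chunks; the oracle `still_triggers_bug` is a hypothetical callback that would re-run the compiler on a candidate program and report whether the original bug still occurs.

```python
# Simplified line-based ddmin (complement-only variant), for illustration.
# `still_triggers_bug` is a hypothetical oracle: it should write the
# candidate lines to a file, invoke the compiler under test, and return
# True iff the original bug (e.g., a crash) is still triggered.

def ddmin(lines, still_triggers_bug):
    """Return a smaller list of lines that still triggers the bug."""
    n = 2  # current granularity: number of chunks to split into
    while len(lines) >= 2:
        chunk = max(1, len(lines) // n)
        subsets = [lines[i:i + chunk] for i in range(0, len(lines), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Try the complement: all lines except those in the i-th chunk.
            candidate = [l for j, s in enumerate(subsets) if j != i for l in s]
            if candidate and still_triggers_bug(candidate):
                lines = candidate        # keep the smaller failing input
                n = max(n - 1, 2)        # coarsen the granularity again
                reduced = True
                break
        if not reduced:
            if n >= len(lines):
                break                    # finest granularity reached; done
            n = min(n * 2, len(lines))   # refine: split into smaller chunks
    return lines
```

In practice, an oracle for a compiler bug would compile each candidate and check for the original failure signature (exit code, crash message, or miscompiled output), which is also why reduction cost is dominated by the number of oracle invocations, one of the efficiency dimensions the study compares.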