Ad hoc Test Generation Through Binary Rewriting

2020 IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM). Pub Date: 2020-09-01. DOI: 10.1109/SCAM51674.2020.00018

Anthony Saieva, S. Singh, G. Kaiser


Citations: 2

Abstract

When a security vulnerability or other critical bug escapes the developers’ test suite and is discovered only after deployment, developers must quickly devise a new test that reproduces the buggy behavior. They then need to test whether their candidate patch indeed fixes the bug without breaking other functionality, all while racing to deploy before attackers pounce on exposed user installations. This can be challenging when factors in a specific user environment triggered the bug. If enabled, however, record-replay technology faithfully replays the execution in the developer environment as if the program were executing in that user environment under the same conditions in which the bug manifested. This includes intermediate program states dependent on system calls, memory layout, etc., as well as any externally-visible behavior. Many modern record-replay tools integrate interactive debuggers to help locate the root cause, but they do not help developers test whether their patch indeed eliminates the bug under those same conditions. In particular, modern record-replay tools that reproduce intermediate program state cannot replay recordings made with one version of a program using a different version of the program when the differences affect program state. This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution. These tests reflect the arbitrary (ad hoc) user and system circumstances that uncovered the bug, enabling developers to check whether a patch indeed fixes that bug. The tests essentially replay recordings made with one version of a program using a different version of the program, even when the differences impact program state. They do so by manipulating both the binary executable and the recorded log so that the resulting execution is consistent with what would have happened had the patched version executed in the user environment under the same conditions in which the bug manifested with the original version. Our approach also enables users to make new recordings of their own workloads with the original version of the program, and to automatically generate and run the corresponding ad hoc tests on the patched version, validating that the patch does not break functionality they rely on.
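To make the record-then-replay-on-a-patch idea concrete, here is a minimal toy sketch in Python. It is not the authors' tool: the paper operates on binary executables and recorded system-level logs, whereas this sketch only records the values a function reads from a stand-in environment and feeds them back to a patched function. All names (`record`, `replay`, `original`, `patched`) are hypothetical, and the sketch assumes the patched version consumes the same inputs in the same order, which is exactly the restriction the paper's log and binary manipulation is meant to lift.

```python
from typing import Callable, List, Tuple

Reader = Callable[[], str]          # stand-in for environment-dependent input
Program = Callable[[Reader], int]   # a "program version" under test


def record(program: Program, inputs: List[str]) -> List[str]:
    """Run the program on live inputs and log every value it reads."""
    log: List[str] = []
    source = iter(inputs)

    def read() -> str:
        value = next(source)
        log.append(value)
        return value

    try:
        program(read)
    except Exception:
        pass  # the failure is the bug being captured; the log holds its inputs
    return log


def replay(program: Program, log: List[str]) -> Tuple[bool, object]:
    """Re-run a program against a recorded log; return (succeeded, result or error)."""
    source = iter(log)
    try:
        return True, program(lambda: next(source))
    except Exception as exc:
        return False, exc


def original(read: Reader) -> int:
    """Hypothetical buggy version: averages `count` integers."""
    count = int(read())
    total = sum(int(read()) for _ in range(count))
    return total // count  # bug: divides by zero when count == 0


def patched(read: Reader) -> int:
    """Hypothetical candidate patch: guards the empty case."""
    count = int(read())
    total = sum(int(read()) for _ in range(count))
    return total // count if count else 0


if __name__ == "__main__":
    bug_log = record(original, ["0"])               # recording made where the bug manifested
    print(replay(original, bug_log)[0])             # original still fails on replay
    print(replay(patched, bug_log))                 # ad hoc test of the candidate patch
    workload = record(original, ["2", "4", "6"])    # a user's own working workload
    print(replay(original, workload) == replay(patched, workload))  # regression check
```

The last check mirrors the abstract's second use case: a recording of a workload that already works with the original version becomes a regression test for the patched version.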



