Almost correct invariants: synthesizing inductive invariants by fuzzing proofs

Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
Pub Date: 2022-07-18
DOI: 10.1145/3533767.3534381

S. Lahiri, Subhajit Roy


Citations: 8

Abstract

Real-life programs contain multiple operations whose semantics are unavailable to verification engines, such as third-party library calls, inline assembly and SIMD instructions, special compiler-provided primitives, and queries to uninterpretable machine learning models. Despite the exceptional success of program verification, synthesizing inductive invariants for such "open" programs has remained a challenge. Currently, this problem is handled by manually "closing" the program, that is, by providing hand-written stubs that attempt to capture the behavior of the unmodelled operations. Writing stubs is not only difficult and tedious, but the stubs are often incorrect, which raises serious questions about the whole endeavor. In this work, we propose Almost Correct Invariants as an automated strategy for synthesizing inductive invariants for such "open" programs. We adopt an active learning strategy in which a data-driven learner proposes candidate invariants. In contrast to prior work that attempts to verify invariants, we attempt to falsify them: we reduce the falsification problem to a set of reachability checks on non-deterministic programs, and we rely on modern fuzzers to answer these reachability queries. Our tool, Achar, automatically synthesizes inductive invariants that are sufficient to prove the correctness of the target programs. We compare Achar with a state-of-the-art invariant synthesis tool that employs theorem proving on formulae built over the program source. Though Achar lacks strong soundness guarantees, our experiments show that even with almost no access to the program source, Achar outperforms the state-of-the-art invariant generator that has complete access to the source. We also evaluate Achar on programs that current invariant synthesis engines cannot handle: programs that invoke external library calls, inline assembly, and queries to convolutional neural networks. Achar successfully infers the necessary inductive invariants within a reasonable time.
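To make the falsification idea concrete, the following is a minimal Python sketch and not Achar's implementation. It assumes a hypothetical one-variable loop program, a hypothetical `opaque_step` function standing in for an unmodelled operation, a fixed pool of candidate invariants in place of the data-driven learner, and plain random sampling in place of a real fuzzer. Each of the three checks (initiation, inductiveness, safety) is phrased as a reachability question over a non-deterministically chosen state.

```python
import random

def opaque_step(x):
    """Hypothetical stand-in for an operation a verifier cannot model."""
    return x + 2

# Hypothetical target program:
#   x = 0
#   while x < 10: x = opaque_step(x)
#   assert x == 10
def pre(x):
    return x == 0

def guard(x):
    return x < 10

def post(x):
    return x == 10

def falsify(inv, rng, trials=2000):
    """Try to refute candidate `inv` by random search (a crude fuzzer stand-in).

    The state `x` is picked non-deterministically; each branch asks whether
    a violating point is reachable. Returns (check_name, witness) or None.
    """
    for _ in range(trials):
        x = rng.randint(-20, 20)
        if pre(x) and not inv(x):
            return ("initiation", x)       # invariant fails on entry
        if inv(x) and guard(x) and not inv(opaque_step(x)):
            return ("inductiveness", x)    # one loop step breaks it
        if inv(x) and not guard(x) and not post(x):
            return ("safety", x)           # too weak to imply the assertion
    return None

# Toy candidate pool; a real learner would generalize from the witnesses.
CANDIDATES = [
    ("True", lambda x: True),
    ("x <= 10", lambda x: x <= 10),
    ("x % 2 == 0", lambda x: x % 2 == 0),
    ("x % 2 == 0 and x <= 10", lambda x: x % 2 == 0 and x <= 10),
]

def synthesize(seed=0):
    """Return the first candidate that survives falsification, plus the
    refuted candidates with their witnesses."""
    rng = random.Random(seed)
    refuted = []
    for name, inv in CANDIDATES:
        witness = falsify(inv, rng)
        if witness is None:
            return name, refuted
        refuted.append((name, witness))
    return None, refuted
```

Calling `synthesize()` returns the surviving candidate together with the witnesses that refuted the earlier ones. A candidate that survives is only "almost correct": the search found no counterexample, which is weaker than a proof and matches the abstract's remark that the approach lacks strong soundness guarantees.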


