Boosting SMT solver performance on mixed-bitwise-arithmetic expressions

Dongpeng Xu, Binbin Liu, Weijie Feng, Jiang Ming, Qilong Zheng, Jing Li, Qiaoyan Yu
{"title":"Boosting SMT solver performance on mixed-bitwise-arithmetic expressions","authors":"Dongpeng Xu, Binbin Liu, Weijie Feng, Jiang Ming, Qilong Zheng, Jing Li, Qiaoyan Yu","doi":"10.1145/3453483.3454068","DOIUrl":null,"url":null,"abstract":"Satisfiability Modulo Theories (SMT) solvers have been widely applied in automated software analysis to reason about the queries that encode the essence of program semantics, relieving the heavy burden of manual analysis. Many SMT solving techniques rely on solving Boolean satisfiability problem (SAT), which is an NP-complete problem, so they use heuristic search strategies to seek possible solutions, especially when no known theorem can efficiently reduce the problem. An emerging challenge, named Mixed-Bitwise-Arithmetic (MBA) obfuscation, impedes SMT solving by constructing identity equations with both bitwise operations (and, or, negate) and arithmetic computation (add, minus, multiply). Common math theorems for bitwise or arithmetic computation are inapplicable to simplifying MBA equations, leading to performance bottlenecks in SMT solving. In this paper, we first scrutinize solvers' performance on solving different categories of MBA expressions: linear, polynomial, and non-polynomial. We observe that solvers can handle simple linear MBA expressions, but facing a severe performance slowdown when solving complex linear and non-linear MBA expressions. The root cause is that complex MBA expressions break the reduction laws for pure arithmetic or bitwise computation. To boost solvers' performance, we propose a semantic-preserving transformation to reduce the mixing degree of bitwise and arithmetic operations. We first calculate a signature vector based on the truth table extracted from an MBA expression, which captures the complete MBA semantics. Next, we generate a simpler MBA expression from the signature vector. Our large-scale evaluation on 3000 complex MBA equations shows that our technique significantly boost modern SMT solvers' performance on solving MBA formulas.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"90 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453483.3454068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Satisfiability Modulo Theories (SMT) solvers have been widely applied in automated software analysis to reason about the queries that encode the essence of program semantics, relieving the heavy burden of manual analysis. Many SMT solving techniques rely on solving Boolean satisfiability problem (SAT), which is an NP-complete problem, so they use heuristic search strategies to seek possible solutions, especially when no known theorem can efficiently reduce the problem. An emerging challenge, named Mixed-Bitwise-Arithmetic (MBA) obfuscation, impedes SMT solving by constructing identity equations with both bitwise operations (and, or, negate) and arithmetic computation (add, minus, multiply). Common math theorems for bitwise or arithmetic computation are inapplicable to simplifying MBA equations, leading to performance bottlenecks in SMT solving. In this paper, we first scrutinize solvers' performance on solving different categories of MBA expressions: linear, polynomial, and non-polynomial. We observe that solvers can handle simple linear MBA expressions, but facing a severe performance slowdown when solving complex linear and non-linear MBA expressions. The root cause is that complex MBA expressions break the reduction laws for pure arithmetic or bitwise computation. To boost solvers' performance, we propose a semantic-preserving transformation to reduce the mixing degree of bitwise and arithmetic operations. We first calculate a signature vector based on the truth table extracted from an MBA expression, which captures the complete MBA semantics. Next, we generate a simpler MBA expression from the signature vector. Our large-scale evaluation on 3000 complex MBA equations shows that our technique significantly boost modern SMT solvers' performance on solving MBA formulas.
提高SMT求解器对混合位算术表达式的性能
可满足模理论(SMT)解算器已广泛应用于自动化软件分析中,用于推理编码程序语义本质的查询,减轻了人工分析的繁重负担。许多SMT求解技术依赖于求解布尔可满足性问题(SAT),这是一个np完全问题,因此它们使用启发式搜索策略来寻找可能的解,特别是当没有已知定理可以有效地简化问题时。一个名为混合位算术(MBA)混淆的新挑战,通过使用位运算(和、或、否定)和算术计算(加、减、乘)构建单位方程,阻碍了SMT的求解。位或算术计算的常见数学定理不适用于简化MBA方程,导致SMT求解的性能瓶颈。在本文中,我们首先仔细研究求解器在求解不同类别的MBA表达式(线性、多项式和非多项式)时的性能。我们观察到,求解器可以处理简单的线性MBA表达式,但在求解复杂的线性和非线性MBA表达式时面临严重的性能放缓。根本原因是复杂的MBA表达式违反了纯算术或按位计算的约简定律。为了提高求解器的性能,我们提出了一种语义保留变换来降低位运算和算术运算的混合程度。我们首先基于从MBA表达式中提取的真值表计算一个签名向量,它捕获了完整的MBA语义。接下来,我们从签名向量生成一个更简单的MBA表达式。我们对3000个复杂MBA方程的大规模评估表明,我们的技术显著提高了现代SMT求解器在求解MBA公式方面的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信