Fast graph simplification for interleaved Dyck-reachability

Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation. Pub Date: 2020-06-06. DOI: 10.1145/3385412.3386021

Yuanbo Li, Qirun Zhang, T. Reps


Citations: 24

Abstract

Many program-analysis problems can be formulated as graph-reachability problems. Interleaved Dyck language reachability (InterDyck-reachability) is a fundamental framework to express a wide variety of program-analysis problems over edge-labeled graphs. The InterDyck language represents an intersection of multiple matched-parenthesis languages (i.e., Dyck languages). In practice, program analyses typically leverage one Dyck language to achieve context-sensitivity, and other Dyck languages to model data dependences, such as field-sensitivity and pointer references/dereferences. In the ideal case, an InterDyck-reachability framework should model multiple Dyck languages simultaneously. Unfortunately, precise InterDyck-reachability is undecidable. Any practical solution must over-approximate the exact answer. In the literature, a lot of work has been proposed to over-approximate the InterDyck-reachability formulation. This paper offers a new perspective on improving both the precision and the scalability of InterDyck-reachability: we aim to simplify the underlying input graph G. Our key insight is based on the observation that if an edge is not contributing to any InterDyck-path, we can safely eliminate it from G. Our technique is orthogonal to the InterDyck-reachability formulation, and can serve as a pre-processing step with any over-approximating approaches for InterDyck-reachability. We have applied our graph simplification algorithm to pre-processing the graphs from a recent InterDyck-reachability-based taint analysis for Android. Our evaluation on three popular InterDyck-reachability algorithms yields promising results. In particular, our graph-simplification method improves both the scalability and precision of all three InterDyck-reachability algorithms, sometimes dramatically.
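To make the two central notions concrete, the following Python sketch may help. It is an illustration only, not the paper's algorithm: the label encoding, the function names (`is_dyck`, `is_interdyck`, `contributing_edges`), and the toy graph are all assumptions introduced here. The first part checks whether a label sequence belongs to an InterDyck language over two alphabets (parentheses and brackets) by testing each projection separately. The second part finds the edges that lie on some InterDyck-path by brute-force enumeration of bounded-length paths, which is exponential and only feasible on tiny graphs; the paper's contribution is a fast simplification, whose details are not given in the abstract.

```python
from typing import List, Set, Tuple

# Hypothetical label encoding: ("(", 1) opens parenthesis 1 and (")", 1)
# closes it; ("[", 1) and ("]", 1) do the same for the bracket alphabet.
Label = Tuple[str, int]
Edge = Tuple[str, str, Label]  # (source node, target node, label)


def is_dyck(word: List[Label], open_sym: str, close_sym: str) -> bool:
    """Check that the projection of `word` onto one alphabet is balanced.

    Symbols from the other alphabet are ignored.
    """
    stack: List[int] = []
    for kind, index in word:
        if kind == open_sym:
            stack.append(index)
        elif kind == close_sym:
            if not stack or stack.pop() != index:
                return False
    return not stack


def is_interdyck(word: List[Label]) -> bool:
    """A word is InterDyck if it is balanced in both alphabets at once."""
    return is_dyck(word, "(", ")") and is_dyck(word, "[", "]")


def contributing_edges(edges: List[Edge], max_len: int) -> Set[int]:
    """Brute force: indices of edges on some InterDyck-path of <= max_len edges.

    This is NOT the paper's method; it only illustrates which edges a
    simplification is allowed to remove (those absent from the result).
    """
    outgoing = {}
    for i, (src, _, _) in enumerate(edges):
        outgoing.setdefault(src, []).append(i)

    useful: Set[int] = set()

    def extend(node: str, path: List[int]) -> None:
        if path and is_interdyck([edges[i][2] for i in path]):
            useful.update(path)
        if len(path) < max_len:
            for i in outgoing.get(node, []):
                extend(edges[i][1], path + [i])

    for start in {src for src, _, _ in edges}:
        extend(start, [])
    return useful


# Toy graph (assumed for illustration). The path a -> b -> c -> d -> e spells
# "(1 [1 )1 ]1": not properly nested as a single bracket language, but each
# projection is balanced, so it is an InterDyck-path.
toy_edges: List[Edge] = [
    ("a", "b", ("(", 1)),
    ("b", "c", ("[", 1)),
    ("c", "d", (")", 1)),
    ("d", "e", ("]", 1)),
    ("b", "f", (")", 2)),  # never matched by any opening parenthesis 2
]

kept = contributing_edges(toy_edges, max_len=5)
removable = set(range(len(toy_edges))) - kept
```

In this toy graph the edge `b -> f` lies on no InterDyck-path, so it ends up in `removable`. Dropping such an edge before running an over-approximating InterDyck-reachability algorithm is the kind of pre-processing the abstract describes.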

