Proceedings of the ACM on Programming Languages最新文献_第5页

API-Driven Program Synthesis for Testing Static Typing Implementations 测试静态类型实现的 API 驱动程序综合

IF 1.8

Proceedings of the ACM on Programming Languages Pub Date : 2024-01-05 DOI: 10.1145/3632904

Thodoris Sotiropoulos, Stefanos Chaliasos, Zhendong Su

{"title":"API-Driven Program Synthesis for Testing Static Typing Implementations","authors":"Thodoris Sotiropoulos, Stefanos Chaliasos, Zhendong Su","doi":"10.1145/3632904","DOIUrl":"https://doi.org/10.1145/3632904","url":null,"abstract":"We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis. The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries. Our primary insight is backed up by real-world evidence: a significant number of compiler typing bugs are caused by small test cases that employ APIs from the standard library of the language under test. This is attributed to the inherent complexity of the majority of these APIs, which often exercise a wide range of sophisticated type-related features. The main contribution of our approach is the ability to produce small client programs with increased feature coverage, without bearing the burden of generating the corresponding well-formed API definitions from scratch. To validate diverse aspects of static typing procedures (i.e., soundness, precision of type inference), we also enrich our API-driven approach with fault-injection and semantics-preserving modes, along with their corresponding test oracles. We evaluate our implemented tool, Thalia on testing the static typing implementations of the compilers for three popular languages, namely, Scala, Kotlin, and Groovy. Thalia has uncovered 84 typing bugs (77 confirmed and 22 fixed), most of which are triggered by test cases featuring APIs that rely on parametric polymorphism, overloading, and higher-order functions. Our comparison with state-of-the-art shows that Thalia yields test programs with distinct characteristics, offering additional and complementary benefits.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"52 7","pages":"1850 - 1881"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139384150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

The Bounded Pathwidth of Control-Flow Graphs 控制流图的有界路径宽度

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622807

Giovanna Kobus Conrado, Amir Kafshdar Goharshady, Chun Kit Lam

{"title":"The Bounded Pathwidth of Control-Flow Graphs","authors":"Giovanna Kobus Conrado, Amir Kafshdar Goharshady, Chun Kit Lam","doi":"10.1145/3622807","DOIUrl":"https://doi.org/10.1145/3622807","url":null,"abstract":"Pathwidth and treewidth are standard and well-studied graph sparsity parameters which intuitively model the degree to which a given graph resembles a path or a tree, respectively. It is well-known that the control-flow graphs of structured goto-free programs have a tree-like shape and bounded treewidth. This fact has been exploited to design considerably more efficient algorithms for a wide variety of static analysis and compiler optimization problems, such as register allocation, µ-calculus model-checking and parity games, data-flow analysis, cache management, and liftetime-optimal redundancy elimination. However, there is no bound in the literature for the pathwidth of programs, except the general inequality that the pathwidth of a graph is at most O (lg n ) times its treewidth, where n is the number of vertices of the graph. In this work, we prove that control-flow graphs of structured programs have bounded pathwidth and provide a linear-time algorithm to obtain a path decomposition of small width. Specifically, we establish a bound of 2 · d on the pathwidth of programs with nesting depth d . Since real-world programs have small nesting depth, they also have bounded pathwidth. This is significant for a number of reasons: (i) ‍pathwidth is a strictly stronger parameter than treewidth, i.e. ‍any graph family with bounded pathwidth has bounded treewidth, but the converse does not hold; (ii) ‍any algorithm that is designed with treewidth in mind can be applied to bounded-pathwidth graphs with no change; (iii) ‍there are problems that are fixed-parameter tractable with respect to pathwidth but not treewidth; (iv) ‍verification algorithms that are designed based on treewidth would become significantly faster when using pathwidth as the parameter; and (v) ‍it is easier to design algorithms based on bounded pathwidth since one does not have to consider the often-challenging case of merge nodes in treewidth-based dynamic programming. Thus, we invite the static analysis and compiler optimization communities to adopt pathwidth as their parameter of choice instead of, or in addition to, treewidth. Intuitively, control-flow graphs are not only tree-like, but also path-like and one can obtain simpler and more scalable algorithms by relying on path-likeness instead of tree-likeness. As a motivating example, we provide a simpler and more efficient algorithm for spill-free register allocation using bounded pathwidth instead of treewidth. Our algorithm reduces the runtime from O ( n · r 2 · tw · r + 2 · r ) to O ( n · pw · r pw · r + r + 1 ), where n is the number of lines of code, r is the number of registers, pw is the pathwidth of the control-flow graph and tw is its treewidth. We provide extensive experimental results showing that our approach is applicable to a wide variety of real-world embedded benchmarks from SDCC and obtains runtime improvements of 2-3 orders of magnitude. This is because the pathwidth is equal to the treewidth, or one more, in the ove","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136112664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

A Container-Usage-Pattern-Based Context Debloating Approach for Object-Sensitive Pointer Analysis 面向对象敏感指针分析的基于容器使用模式的上下文展开方法

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622832

Dongjie He, Yujiang Gui, Wei Li, Yonggang Tao, Changwei Zou, Yulei Sui, Jingling Xue

{"title":"A Container-Usage-Pattern-Based Context Debloating Approach for Object-Sensitive Pointer Analysis","authors":"Dongjie He, Yujiang Gui, Wei Li, Yonggang Tao, Changwei Zou, Yulei Sui, Jingling Xue","doi":"10.1145/3622832","DOIUrl":"https://doi.org/10.1145/3622832","url":null,"abstract":"In this paper, we introduce DebloaterX, a new approach for automatically identifying context-independent objects to debloat contexts in object-sensitive pointer analysis ( k obj). Object sensitivity achieves high precision, but its context construction mechanism combines objects with their contexts indiscriminately. This leads to a combinatorial explosion of contexts in large programs, resulting in inefficiency. Previous research has proposed a context-debloating approach that inhibits a pre-selected set of context-independent objects from forming new contexts, improving the efficiency of k obj. However, this earlier context-debloating approach under-approximates the set of context-independent objects identified, limiting performance speedups. We introduce a novel context-debloating pre-analysis approach that identifies objects as context-dependent only when they are potentially precision-critical to k obj based on three general container-usage patterns. Our research finds that objects containing no fields of ”abstract” (i.e., open) types can be analyzed context-insensitively with negligible precision loss in real-world applications. We provide clear rules and efficient algorithms to recognize these patterns, selecting more context-independent objects for better debloating. We have implemented DebloaterX in the Qilin framework and will release it as an open-source tool. Our experimental results on 12 standard Java benchmarks and real-world programs show that DebloaterX selects 92.4% of objects to be context-independent on average, enabling k obj to run significantly faster (an average of 19.3x when k = 2 and 150.2x when k = 3) and scale up to 8 more programs when k = 3, with only a negligible loss of precision (less than 0.2%). Compared to state-of-the-art alternative pre-analyses in accelerating k obj, DebloaterX outperforms Zipper significantly in both precision and efficiency and outperforms Conch (the earlier context-debloating approach) in efficiency substantially while achieving nearly the same precision.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136112813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

How Domain Experts Use an Embedded DSL 领域专家如何使用嵌入式DSL

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622851

Lisa Rennels, Sarah E. Chasins

引用次数: 0

Rapid: Region-Based Pointer Disambiguation 快速:基于区域的指针消歧

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622859

Khushboo Chitre, Piyus Kedia, Rahul Purandare

{"title":"Rapid: Region-Based Pointer Disambiguation","authors":"Khushboo Chitre, Piyus Kedia, Rahul Purandare","doi":"10.1145/3622859","DOIUrl":"https://doi.org/10.1145/3622859","url":null,"abstract":"Interprocedural alias analyses often sacrifice precision for scalability. Thus, modern compilers such as GCC and LLVM implement more scalable but less precise intraprocedural alias analyses. This compromise makes the compilers miss out on potential optimization opportunities, affecting the performance of the application. Modern compilers implement loop-versioning with dynamic checks for pointer disambiguation to enable the missed optimizations. Polyhedral access range analysis and symbolic range analysis enable 𝑂 (1) range checks for non-overlapping of memory accesses inside loops. However, these approaches work only for the loops in which the loop bounds are loop invariants. To address this limitation, researchers proposed a technique that requires 𝑂 (𝑙𝑜𝑔 𝑛) memory accesses for pointer disambiguation. Others improved the performance of dynamic checks to single memory access by constraining the object size and alignment. However, the former approach incurs noticeable overhead due to its dynamic checks, whereas the latter has a noticeable allocator overhead. Thus, scalability remains a challenge. In this work, we present a tool, Rapid, that further reduces the overheads of the allocator and dynamic checks proposed in the existing approaches. The key idea is to identify objects that need disambiguation checks using a profiler and allocate them in different regions, which are disjoint memory areas. The disambiguation checks simply compare the regions corresponding to the objects. The regions are aligned such that the top 32 bits in the addresses of any two objects allocated in different regions are always different. As a consequence, the dynamic checks do not require any memory access to ensure that the objects belong to different regions, making them efficient. Rapid achieved a maximum performance benefit of around 52.94% for Polybench and 1.88% for CPU SPEC 2017 benchmarks. The maximum CPU overhead of our allocator is 0.57% with a geometric mean of -0.2% for CPU SPEC 2017 benchmarks. Due to the low overhead of the allocator and dynamic checks, Rapid could improve the performance of 12 out of 16 CPU SPEC 2017 benchmarks. In contrast, a state-of-the-art approach used in the comparison could improve only five CPU SPEC 2017 benchmarks.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136116388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs 用于训练程序的神经代理的复杂性引导数据采样

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622856

Renda, Alex, Ding, Yi, Carbin, Michael

引用次数: 0

Perception Contracts for Safety of ML-Enabled Systems 基于机器学习的系统安全感知契约

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622875

Angello Astorga, Chiao Hsieh, P. Madhusudan, Sayan Mitra

引用次数: 3

Complete First-Order Reasoning for Properties of Functional Programs 函数程序性质的完全一阶推理

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622835

Adithya Murali, Lucas Peña, Ranjit Jhala, P. Madhusudan

引用次数: 0

Type-Safe Dynamic Placement with First-Class Placed Values 具有一级放置值的类型安全动态放置

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622873

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

{"title":"Type-Safe Dynamic Placement with First-Class Placed Values","authors":"George Zakhour, Pascal Weisenburger, Guido Salvaneschi","doi":"10.1145/3622873","DOIUrl":"https://doi.org/10.1145/3622873","url":null,"abstract":"Several distributed programming language solutions have been proposed to reason about the placement of data, computations, and peers interaction. Such solutions include, among the others, multitier programming, choreographic programming and various approaches based on behavioral types. These methods statically ensure safety properties thanks to a complete knowledge about placement of data and computation at compile time. In distributed systems, however, dynamic placement of computation and data is crucial to enable performance optimizations, e.g., driven by data locality or in presence of a number of other constraints such as security and compliance regarding data storage location. Unfortunately, in existing programming languages, dynamic placement conflicts with static reasoning about distributed programs: the flexibility required by dynamic placement hinders statically tracking the location of data and computation. In this paper we present Dyno, a programming language that enables static reasoning about dynamic placement. Dyno features a type system where values are explicitly placed, but in contrast to existing approaches, placed values are also first class, ensuring that they can be passed around and referred to from other locations. Building on top of this mechanism, we provide a novel interpretation of dynamic placement as unions of placement types. We formalize type soundness, placement correctness (as part of type soundness) and architecture conformance. In case studies and benchmarks, our evaluation shows that Dyno enables static reasoning about programs even in presence of dynamic placement, ensuring type safety and placement correctness of programs at negligible performance cost. We reimplement an Android app with ∼ 7 K LOC in Dyno, find a bug in the existing implementation, and show that the app's approach is representative of a common way to implement dynamic placement found in over 100 apps in a large open-source app store.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136112807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Towards Better Semantics Exploration for Browser Fuzzing 面向浏览器模糊测试更好的语义探索

Proceedings of the ACM on Programming Languages Pub Date : 2023-10-16 DOI: 10.1145/3622819

Chijin Zhou, Quan Zhang, Lihua Guo, Mingzhe Wang, Yu Jiang, Qing Liao, Zhiyong Wu, Shanshan Li, Bin Gu

{"title":"Towards Better Semantics Exploration for Browser Fuzzing","authors":"Chijin Zhou, Quan Zhang, Lihua Guo, Mingzhe Wang, Yu Jiang, Qing Liao, Zhiyong Wu, Shanshan Li, Bin Gu","doi":"10.1145/3622819","DOIUrl":"https://doi.org/10.1145/3622819","url":null,"abstract":"Web browsers exhibit rich semantics that enable a plethora of web-based functionalities. However, these intricate semantics present significant challenges for the implementation and testing of browsers. For example, fuzzing, a widely adopted testing technique, typically relies on handwritten context-free grammars (CFGs) for automatically generating inputs. However, these CFGs fall short in adequately modeling the complex semantics of browsers, resulting in generated inputs that cover only a portion of the semantics and are prone to semantic errors. In this paper, we present SaGe, an automated method that enhances browser fuzzing through the use of production-context sensitive grammars (PCSGs) incorporating semantic information. Our approach begins by extracting a rudimentary CFG from W3C standards and iteratively enhancing it to create a PCSG. The resulting PCSG enables our fuzzer to generate inputs that explore a broader range of browser semantics with a higher proportion of semantically-correct inputs. To evaluate the efficacy of SaGe, we conducted 24-hour fuzzing campaigns on mainstream browsers, including Chrome, Safari, and Firefox. Our approach demonstrated better performance compared to existing browser fuzzers, with a 6.03%-277.80% improvement in edge coverage, a 3.56%-161.71% boost in semantic correctness rate, twice the number of bugs discovered. Moreover, we identified 62 bugs across the three browsers, with 40 confirmed and 10 assigned CVEs.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136114709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0