{"title":"PMAF: an algebraic framework for static analysis of probabilistic programs","authors":"Di Wang, Jan Hoffmann, T. Reps","doi":"10.1145/3192366.3192408","DOIUrl":"https://doi.org/10.1145/3192366.3192408","url":null,"abstract":"Automatically establishing that a probabilistic program satisfies some property ϕ is a challenging problem. While a sampling-based approach—which involves running the program repeatedly—can suggest that ϕ holds, to establish that the program satisfies ϕ, analysis techniques must be used. Despite recent successes, probabilistic static analyses are still more difficult to design and implement than their deterministic counterparts. This paper presents a framework, called PMAF, for designing, implementing, and proving the correctness of static analyses of probabilistic programs with challenging features such as recursion, unstructured control-flow, divergence, nondeterminism, and continuous distributions. PMAF introduces pre-Markov algebras to factor out common parts of different analyses. To perform interprocedural analysis and to create procedure summaries, PMAF extends ideas from non-probabilistic interprocedural dataflow analysis to the probabilistic setting. One novelty is that PMAF is based on a semantics formulated in terms of a control-flow hyper-graph for each procedure, rather than a standard control-flow graph. To evaluate its effectiveness, PMAF has been used to reformulate and implement existing intraprocedural analyses for Bayesian-inference and the Markov decision problem, by creating corresponding interprocedural analyses. Additionally, PMAF has been used to implement a new interprocedural linear expectation-invariant analysis. Experiments with benchmark programs for the three analyses demonstrate that the approach is practical.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73257417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lucas Brutschy, Dimitar Dimitrov, Peter Müller, Martin T. Vechev
{"title":"Static serializability analysis for causal consistency","authors":"Lucas Brutschy, Dimitar Dimitrov, Peter Müller, Martin T. Vechev","doi":"10.1145/3192366.3192415","DOIUrl":"https://doi.org/10.1145/3192366.3192415","url":null,"abstract":"Many distributed databases provide only weak consistency guarantees to reduce synchronization overhead and remain available under network partitions. However, this leads to behaviors not possible under stronger guarantees. Such behaviors can easily defy programmer intuition and lead to errors that are notoriously hard to detect. In this paper, we propose a static analysis for detecting non-serializable behaviors of applications running on top of causally-consistent databases. Our technique is based on a novel, local serializability criterion and combines a generalization of graph-based techniques from the database literature with another, complementary analysis technique that encodes our serializability criterion into first-order logic formulas to be checked by an SMT solver. This analysis is more expensive yet more precise and produces concrete counter-examples. We implemented our methods and evaluated them on a number of applications from two different domains: cloud-backed mobile applications and clients of a distributed database. Our experiments demonstrate that our analysis is able to detect harmful serializability violations while producing only a small number of false alarms.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75237694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CUBA: interprocedural Context-UnBounded Analysis of concurrent programs","authors":"Peizun Liu, T. Wahl","doi":"10.1145/3192366.3192419","DOIUrl":"https://doi.org/10.1145/3192366.3192419","url":null,"abstract":"A classical result by Ramalingam about synchronization-sensitive interprocedural program analysis implies that reachability for concurrent threads running recursive procedures is undecidable. A technique proposed by Qadeer and Rehof, to bound the number of context switches allowed between the threads, leads to an incomplete solution that is, however, believed to catch “most bugs” in practice. The question whether the technique can also prove the absence of bugs at least in some cases has remained largely open. In this paper we introduce a broad verification methodology for resource-parameterized programs that observes how changes to the resource parameter affect the behavior of the program. Applied to the context-unbounded analysis problem (CUBA), the methodology results in partial verification techniques for procedural concurrent programs. Our solutions may not terminate, but are able to both refute and prove context-unbounded safety for concurrent recursive threads. We demonstrate the effectiveness of our method using a variety of examples, the safe of which cannot be proved safe by earlier, context-bounded methods.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"388 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79526957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongyu Liu, Sam Silvestro, Wei Wang, Chen Tian, Tongping Liu
{"title":"iReplayer: in-situ and identical record-and-replay for multithreaded applications","authors":"Hongyu Liu, Sam Silvestro, Wei Wang, Chen Tian, Tongping Liu","doi":"10.1145/3192366.3192380","DOIUrl":"https://doi.org/10.1145/3192366.3192380","url":null,"abstract":"Reproducing executions of multithreaded programs is very challenging due to many intrinsic and external non-deterministic factors. Existing RnR systems achieve significant progress in terms of performance overhead, but none targets the in-situ setting, in which replay occurs within the same process as the recording process. Also, most existing work cannot achieve identical replay, which may prevent the reproduction of some errors. This paper presents iReplayer, which aims to identically replay multithreaded programs in the original process (under the \"in-situ\" setting). The novel in-situ and identical replay of iReplayer makes it more likely to reproduce errors, and allows it to directly employ debugging mechanisms (e.g. watchpoints) to aid failure diagnosis. Currently, iReplayer only incurs 3% performance overhead on average, which allows it to be always enabled in the production environment. iReplayer enables a range of possibilities, and this paper presents three examples: two automatic tools for detecting buffer overflows and use-after-free bugs, and one interactive debugging tool that is integrated with GDB.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86809499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
{"title":"A general path-based representation for predicting program properties","authors":"Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav","doi":"10.1145/3192366.3192412","DOIUrl":"https://doi.org/10.1145/3192366.3192412","url":null,"abstract":"Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming, and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our representation is purely syntactic and extracted automatically. The main idea is to represent a program using paths in its abstract syntax tree (AST). This allows a learning model to leverage the structured nature of code rather than treating it as a flat sequence of tokens. We show that this representation is general and can: (i) cover different prediction tasks, (ii) drive different learning algorithms (for both generative and discriminative models), and (iii) work across different programming languages. We evaluate our approach on the tasks of predicting variable names, method names, and full types. We use our representation to drive both CRF-based and word2vec-based learning, for programs of four languages: JavaScript, Java, Python and C#. Our evaluation shows that our approach obtains better results than task-specific handcrafted representations across different tasks and programming languages.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"54 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89262926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Search, align, and repair: data-driven feedback generation for introductory programming exercises","authors":"Ke Wang, Rishabh Singh, Z. Su","doi":"10.1145/3192366.3192384","DOIUrl":"https://doi.org/10.1145/3192366.3192384","url":null,"abstract":"This paper introduces the “Search, Align, and Repair” data-driven program repair framework to automate feedback generation for introductory programming exercises. Distinct from existing techniques, our goal is to develop an efficient, fully automated, and problem-agnostic technique for large or MOOC-scale introductory programming courses. We leverage the large amount of available student submissions in such settings and develop new algorithms for identifying similar programs, aligning correct and incorrect programs, and repairing incorrect programs by finding minimal fixes. We have implemented our technique in the Sarfgen system and evaluated it on thousands of real student attempts from the Microsoft-DEV204.1x edX course and the Microsoft CodeHunt platform. Our results show that Sarfgen can, within two seconds on average, generate concise, useful feedback for 89.7% of the incorrect student submissions. It has been integrated with the Microsoft-DEV204.1X edX class and deployed for production use.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85969243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active learning of points-to specifications","authors":"O. Bastani, Rahul Sharma, A. Aiken, Percy Liang","doi":"10.1145/3192366.3192383","DOIUrl":"https://doi.org/10.1145/3192366.3192383","url":null,"abstract":"When analyzing programs, large libraries pose significant challenges to static points-to analysis. A popular solution is to have a human analyst provide points-to specifications that summarize relevant behaviors of library code, which can substantially improve precision and handle missing code such as native code. We propose Atlas, a tool that automatically infers points-to specifications. Atlas synthesizes unit tests that exercise the library code, and then infers points-to specifications based on observations from these executions. Atlas automatically infers specifications for the Java standard library, and produces better results for a client static information flow analysis on a benchmark of 46 Android apps compared to using existing handwritten specifications.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81900246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The semantics of transactions and weak memory in x86, Power, ARM, and C++","authors":"Nathan Chong, Tyler Sorensen, John Wickerson","doi":"10.1145/3192366.3192373","DOIUrl":"https://doi.org/10.1145/3192366.3192373","url":null,"abstract":"Weak memory models provide a complex, system-centric semantics for concurrent programs, while transactional memory (TM) provides a simpler, programmer-centric semantics. Both have been studied in detail, but their combined semantics is not well understood. This is problematic because such widely-used architectures and languages as x86, Power, and C++ all support TM, and all have weak memory models. Our work aims to clarify the interplay between weak memory and TM by extending existing axiomatic weak memory models (x86, Power, ARMv8, and C++) with new rules for TM. Our formal models are backed by automated tooling that enables (1) the synthesis of tests for validating our models against existing implementations and (2) the model-checking of TM-related transformations, such as lock elision and compiling C++ transactions to hardware. A key finding is that a proposed TM extension to ARMv8 currently being considered within ARM Research is incompatible with lock elision without sacrificing portability or performance.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78178123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alex Sanchez-Stern, P. Panchekha, Sorin Lerner, Zachary Tatlock
{"title":"Finding root causes of floating point error","authors":"Alex Sanchez-Stern, P. Panchekha, Sorin Lerner, Zachary Tatlock","doi":"10.1145/3192366.3192411","DOIUrl":"https://doi.org/10.1145/3192366.3192411","url":null,"abstract":"Floating-point arithmetic plays a central role in science, engineering, and finance by enabling developers to approximate real arithmetic. To address numerical issues in large floating-point applications, developers must identify root causes, which is difficult because floating-point errors are generally non-local, non-compositional, and non-uniform. This paper presents Herbgrind, a tool to help developers identify and address root causes in numerical code written in low-level languages like C/C++ and Fortran. Herbgrind dynamically tracks dependencies between operations and program outputs to avoid false positives and abstracts erroneous computations to simplified program fragments whose improvement can reduce output error. We perform several case studies applying Herbgrind to large, expert-crafted numerical programs and show that it scales to applications spanning hundreds of thousands of lines, correctly handling the low-level details of modern floating point hardware and mathematical libraries and tracking error across function boundaries and through the heap.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"303 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77198711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}