{"title":"MOAD: Modeling Observation-Based Approximate Dependency","authors":"Seongmin Lee, D. Binkley, R. Feldt, N. Gold, S. Yoo","doi":"10.1109/SCAM.2019.00011","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00011","url":null,"abstract":"While dependency analysis is foundational to many applications of program analysis, the static nature of many existing techniques presents challenges such as limited scalability and inability to cope with multi-lingual systems. We present a novel dependency analysis technique that aims to approximate program dependency from a relatively small number of perturbed executions. Our technique, called MOAD (Modeling Observation-based Approximate Dependency), reformulates program dependency as the likelihood that one program element is dependent on another, instead of a more classical Boolean relationship. MOAD generates a set of program variants by deleting parts of the source code, and executes them while observing the impacts of the deletions on various program points. From these observations, MOAD infers a model of program dependency that captures the dependency relationship between the modification and observation points. While MOAD is a purely dynamic dependency analysis technique similar to Observation Based Slicing (ORBS), it does not require iterative deletions. Rather, MOAD makes a much smaller number of multiple, independent observations in parallel and infers dependency relationships for multiple program elements simultaneously, significantly reducing the cost of dynamic dependency analysis. We evaluate MOAD by instantiating program slices from the obtained probabilistic dependency model. 
Compared to ORBS, MOAD's model construction requires only 18.7% of the observations used by ORBS, while its slices are only 16% larger than the corresponding ORBS slice, on average.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132508714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
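The inference step described in the abstract can be sketched as follows: delete random subsets of program units, run each variant, and estimate per (unit, observation point) the fraction of deletions that perturbed the observation. This is a minimal illustration; all names and the random-subset sampling are assumptions, not the paper's implementation.

```python
import random

# Hypothetical sketch of MOAD-style inference. Each variant deletes a random
# subset of program units; run_variant reports, per observation point, whether
# the observed behavior changed relative to the original program.

def infer_dependency(units, obs_points, run_variant, n_variants=64, seed=0):
    """Estimate, for each (unit, obs) pair, the likelihood that obs depends on unit."""
    rng = random.Random(seed)
    changed = {(u, o): 0 for u in units for o in obs_points}
    deleted = {u: 0 for u in units}
    for _ in range(n_variants):
        deletion = {u for u in units if rng.random() < 0.5}
        outcome = run_variant(deletion)  # {obs: True if behavior changed}
        for u in deletion:
            deleted[u] += 1
            for o in obs_points:
                if outcome[o]:
                    changed[(u, o)] += 1
    # Likelihood that deleting u perturbs o, among variants where u was deleted.
    return {k: changed[k] / deleted[k[0]] for k in changed if deleted[k[0]]}
```

Because the variants are independent, they can be executed in parallel, which is the source of the cost reduction over ORBS's iterative deletions.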
{"title":"Evaluating Automatic Fault Localization Using Markov Processes","authors":"Tim A. D. Henderson, Andy Podgurski, Yigit Küçük","doi":"10.1109/SCAM.2019.00021","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00021","url":null,"abstract":"Statistical fault localization (SFL) techniques are commonly compared and evaluated using a measure known as \"Rank Score\" and its associated evaluation process. In the latter process each SFL technique under comparison is used to produce a list of program locations, ranked by their suspiciousness scores. Each technique then receives a Rank Score for each faulty program it is applied to, which is equal to the rank of the first faulty location in the corresponding list. The SFL technique whose average Rank Score is lowest is judged the best overall, based on the assumption that a programmer will examine each location in rank order until a fault is found. However, this assumption oversimplifies how an SFL technique would be used in practice. Programmers are likely to regard suspiciousness ranks as just one source of information among several that are relevant to locating faults. This paper provides a new evaluation approach using first-order Markov models of debugging processes, which can incorporate multiple additional kinds of information, e.g., about code locality, dependences, or even intuition. Our approach, RT_rank, scores SFL techniques based on the expected number of steps a programmer would take through the Markov model before reaching a faulty location. Unlike previous evaluation methods, HT_rank can compare techniques even when they produce fault localization reports differing in structure or information granularity. 
To illustrate the approach, we present a case study comparing two existing fault localization techniques that produce results varying in form and granularity.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134413789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
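The "expected number of steps before reaching a faulty location" is the mean hitting time of an absorbing Markov chain, computable as t = (I - Q)^(-1) 1, where Q restricts the transition matrix to non-faulty states. A minimal sketch (illustrative, not the paper's code):

```python
# Model the programmer's walk as a Markov chain in which faulty locations are
# absorbing, and solve (I - Q) t = 1 for the mean hitting times.

def expected_steps(transition, faulty, start):
    """transition: {state: {successor: prob}}; returns the expected number of
    steps from `start` until a state in `faulty` is first reached."""
    states = [s for s in transition if s not in faulty]
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    # Augmented matrix [I - Q | 1], reduced by Gauss-Jordan elimination.
    a = [[(1.0 if i == j else 0.0) - transition[states[i]].get(states[j], 0.0)
          for j in range(n)] + [1.0] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(n):
            if row != col and a[row][col]:
                f = a[row][col] / a[col][col]
                for c in range(col, n + 1):
                    a[row][c] -= f * a[col][c]
    return a[idx[start]][n] / a[idx[start]][idx[start]]
```

For example, a chain A → B → F (each with probability 1, F faulty) yields an expected 2 steps from A; a self-looping state that escapes to F with probability 0.5 yields an expected 2 steps as well.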
{"title":"The Architectural Security Tool Suite — ARCHSEC","authors":"Bernhard J. Berger, K. Sohr, R. Koschke","doi":"10.1109/SCAM.2019.00035","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00035","url":null,"abstract":"Architectural risk analysis is a risk management process for identifying security flaws at the level of software architectures and is used by large software vendors, to secure their products. We present our architectural security environ- ment (ARCHSEC) that has been developed at our institute during the past eight years in several research projects. ARCHSEC aims to simplify architectural risk analysis, making it easier for small and mid-sized companies to get started. With ARCHSEC, it is possible to graphically model or to reverse engineer software security architectures. The regained software architectures can then be inspected manually or au- tomatically analyzed w.r.t. security flaws, resulting in a threat model, which serves as a base for discussion between software and security experts to improve the overall security of the software system in question, beyond the level of implementation bugs. In the evaluation part of this paper, we demonstrate how we use ARCHSEC in two of our current research projects to analyze business applications. In the first project we use ARCHSEC to identify security flaws in business process diagrams. In the second project, ARCHSEC is integrated into an audit environment for software security certification. 
ARCHSEC is used to identify security flaws and to visualize software systems to improve the effectiveness and efficiency of the certification process.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133770764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
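To make the idea of automated architectural flaw detection concrete, here is a toy check of the kind such tools can run over a recovered architecture: flag every data flow that crosses a trust boundary without encryption. The model shape (`src_zone`, `dst_zone`, `encrypted`) is an assumption for illustration, not ARCHSEC's actual format.

```python
# Flag unencrypted data flows that cross a trust boundary in a simple
# architecture model (a list of flow dicts).

def find_unencrypted_boundary_crossings(flows):
    return [f for f in flows
            if f['src_zone'] != f['dst_zone'] and not f['encrypted']]
```

A threat model assembled from such findings then serves as the discussion basis between software and security experts described above.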
{"title":"An Exploratory Study on Automatic Architectural Change Analysis Using Natural Language Processing Techniques","authors":"A. Mondal, B. Roy, Kevin A. Schneider","doi":"10.1109/SCAM.2019.00016","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00016","url":null,"abstract":"Continuous architecture is vital for developing large, complex software systems and supporting continuous delivery, integration, and testing practices. Researchers and practitioners investigate models and rules for managing change to support architecture continuity. They employ manual techniques to analyze software change, categorizing the changes as perfective, corrective, adaptive, and preventive. However, a manual approach is impractical for analyzing systems involving thousands of artefacts as it is time-consuming, labor-intensive, and error-prone. In this paper, we investigate whether an automatic technique incorporating free-form natural language text (e.g., developers' communication and commit messages) is an effective solution for architectural change analysis. Our experiments with multiple projects showed encouraging results for detecting architectural messages using our proposed language model. Although architectural change categorization for the preventive class is moderate, the outcome for the random dataset is insignificant in general (around a 45% F1 score). We investigated the causes of the unpromising outcome. 
Overall, our study reveals that our automated architectural change analysis tool would be fruitful only if the developers provide considerable technical details in the commit messages or other text.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121061906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
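As a minimal illustration of the categorization task (not the paper's language model, which is far more sophisticated), a keyword baseline over commit messages might look like this; the keyword lists are invented for the example:

```python
# Toy keyword baseline tagging commit messages with the four maintenance
# categories; messages matching no keyword fall into 'other'.

CATEGORY_KEYWORDS = {
    'corrective': ('fix', 'bug', 'crash', 'error'),
    'perfective': ('refactor', 'cleanup', 'improve', 'optimize'),
    'adaptive':   ('port', 'upgrade', 'migrate', 'support'),
    'preventive': ('test', 'lint', 'guard', 'assert'),
}

def classify_commit(message):
    text = message.lower()
    scores = {cat: sum(kw in text for kw in kws)
              for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] else 'other'
```

The study's finding that results depend on the technical detail in commit messages is visible even here: a terse message like "Update readme" carries no signal for any category.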
{"title":"The Strengths and Behavioral Quirks of Java Bytecode Decompilers","authors":"Nicolas Harrand, César Soto-Valero, Monperrus Martin, B. Baudry","doi":"10.1109/SCAM.2019.00019","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00019","url":null,"abstract":"During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, the decompilation process, which aims at producing source code from bytecode, must establish some strategies to reconstruct the information that has been lost. Modern Java decompilers tend to use distinct strategies to achieve proper decompilation. In this work, we hypothesize that the diverse ways in which bytecode can be decompiled has a direct impact on the quality of the source code produced by decompilers. We study the effectiveness of eight Java decompilers with respect to three quality indicators: syntactic correctness, syntactic distortion and semantic equivalence modulo inputs. This study relies on a benchmark set of 14 real-world open-source software projects to be decompiled (2041 classes in total). Our results show that no single modern decompiler is able to correctly handle the variety of bytecode structures coming from real-world programs. 
Even the highest ranking decompiler in this study produces syntactically correct output for 84% of classes of our dataset and semantically equivalent code output for 78% of classes.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132145265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
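The "semantic equivalence modulo inputs" indicator can be sketched as a differential test: two callables (standing in for the original program and its decompiled-then-recompiled counterpart) are considered equivalent if they agree, in result or exception type, on every sampled input. Names and the sampling strategy are assumptions for illustration.

```python
import random

def _outcome(fn, x):
    """Normalize a call into a comparable (status, value) pair."""
    try:
        return ('ok', fn(x))
    except Exception as e:
        return ('err', type(e).__name__)

def equivalent_modulo_inputs(f, g, sample_input, n=1000, seed=0):
    """True iff f and g behave identically on n sampled inputs."""
    rng = random.Random(seed)
    return all(_outcome(f, x) == _outcome(g, x)
               for x in (sample_input(rng) for _ in range(n)))
```

Sampling makes this a sound-but-incomplete check: disagreement proves non-equivalence, while agreement only holds modulo the inputs tried.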
{"title":"A Study on the Effects of Exception Usage in Open-Source C++ Systems","authors":"Kirsten Bradley, Michael W. Godfrey","doi":"10.1109/SCAM.2019.00010","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00010","url":null,"abstract":"Exception handling (EH) is a feature common to many modern programming languages, including C++, Java, and Python, that allows error handling in client code to be performed in a way that is both systematic and largely detached from the implementation of the main functionality. However, C++ developers sometimes choose not to use EH, as they feel that its use increases complexity of the resulting code: new control flow paths are added to the code, \"stack unwinding\" adds extra responsibilities for the developer to worry about, and EH arguably detracts from the modular design of the system. In this paper, we perform an exploratory empirical study of the effects of exceptions usage in 2721 open source C++ systems taken from GitHub. We observed that the number of edges in an augmented call graph increases, on average, by 22% when edges for exception flow are added to a graph. Additionally, about 8 out of 9 functions that may throw only do so by propagating a throw from another function. 
These results suggest that, in practice, the use of C++ EH can add complexity to the design of the system that developers must strive to be aware of.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125549180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
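The propagation measurement behind the roughly 8-in-9 figure can be sketched as a fixed-point computation over the call graph: mark every function containing a direct throw, propagate "may throw" backwards to callers, and measure the fraction of may-throw functions that never throw directly. The graph shape and names are illustrative assumptions.

```python
def may_throw_functions(calls, direct_throwers):
    """calls: {caller: [callees]}; returns every function that may throw,
    directly or by propagation, computed to a fixed point."""
    may = set(direct_throwers)
    changed = True
    while changed:
        changed = False
        for caller, callees in calls.items():
            if caller not in may and any(c in may for c in callees):
                may.add(caller)
                changed = True
    return may

def propagation_ratio(calls, direct_throwers):
    """Fraction of may-throw functions that only propagate."""
    may = may_throw_functions(calls, direct_throwers)
    propagators = may - set(direct_throwers)
    return len(propagators) / len(may) if may else 0.0
```

In a chain main → parse → read where only `read` throws directly, two of the three may-throw functions are pure propagators, illustrating how a single throw site ripples complexity up the call graph.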
{"title":"Automated Customized Bug-Benchmark Generation","authors":"Vineeth Kashyap, Jason Ruchti, Lucja Kot, Emma Turetsky, R. Swords, David Melski, Eric Schulte","doi":"10.1109/SCAM.2019.00020","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00020","url":null,"abstract":"We introduce Bug-Injector, a system that automatically creates benchmarks for customized evaluation of static analysis tools. We share a benchmark generated using Bug-Injector and illustrate its efficacy by using it to evaluate the recall of two leading open-source static analysis tools: Clang Static Analyzer and Infer. Bug-Injector works by inserting bugs based on bug templates into real-world host programs. It runs tests on the host program to collect dynamic traces, searches the traces for a point where the state satisfies the preconditions for some bug template, then modifies the host program to \"inject\" a bug based on that template. Injected bugs are used as test cases in a static analysis tool evaluation benchmark. Every test case is accompanied by a program input that exercises the injected bug. We have identified a broad range of requirements and desiderata for bug benchmarks; our approach generates on-demand test benchmarks that meet these requirements. It also allows us to create customized benchmarks suitable for evaluating tools for a specific use case (e.g., a given codebase and set of bug types). 
Our experimental evaluation demonstrates the suitability of our generated benchmark for evaluating static bug-detection tools and for comparing the performance of different tools.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134430368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
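The trace-matching step described above can be sketched as a scan over a recorded dynamic trace for the first point whose program state satisfies a bug template's precondition; that point is where the bug would be injected. The trace and template shapes are illustrative assumptions, not Bug-Injector's actual formats.

```python
class BugTemplate:
    """A bug pattern with a state precondition (illustrative shape)."""
    def __init__(self, name, precondition, payload):
        self.name = name                  # e.g. 'null-deref'
        self.precondition = precondition  # state_dict -> bool
        self.payload = payload            # code snippet to splice in

def find_injection_point(trace, template):
    """trace: list of (location, state_dict) pairs; returns the first
    location where the template's precondition holds, else None."""
    for location, state in trace:
        if template.precondition(state):
            return location
    return None
```

Because the matched trace point was actually reached under a known test input, that same input can accompany the test case and exercise the injected bug, as the abstract requires.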