{"title":"Mutation-Based Graph Inference for Fault Localization","authors":"Vincenzo Musco, Monperrus Martin, P. Preux","doi":"10.1109/SCAM.2016.24","DOIUrl":"https://doi.org/10.1109/SCAM.2016.24","url":null,"abstract":"We present a new fault localization algorithm, called Vautrin, built on an approximation of causality based on call graphs. The approximation of causality is done using software mutants. The key idea is that if a mutant is killed by a test, certain call graph edges within a path between the mutation point and the failing test are likely causal. We evaluate our approach on the fault localization benchmark by Steimann et al. totaling 5,836 faults. The causal graphs are extracted from 88,732 nodes connected by 119,531 edges. Vautrin improves the fault localization effectiveness for all subjects of the benchmark. Considering the wasted effort at the method level, a classical fault localization evaluation metric, the improvement ranges from 3% to 55%, with an average improvement of 14%.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130491812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects Dependence Graph: A Key Data Concept for C Source-to-Source Compilers","authors":"Nelson Lossing, P. Guillou, F. Irigoin","doi":"10.1109/SCAM.2016.20","DOIUrl":"https://doi.org/10.1109/SCAM.2016.20","url":null,"abstract":"Optimizations, transformations and analyses are applied to programs by compilers at the intermediate representation level, which usually does not include explicit variable declarations. This description level is fine for middle-ends and for source-to-source optimizers of simple languages. Meanwhile, the C language has become much more flexible since the C99 standard, and lets variable and type declarations appear almost anywhere in source code. We present in this paper a new concept to manage C99 declarations in a source-to-source compiler: the Effects Dependence Graph, which is an extension of the classical Data Dependence Graph. It deals particularly efficiently with user-defined type declarations or dependent types like Variable-Length Arrays. It is also interesting because no legal scheduling transformation is hindered and because existing algorithms are either unmodified or only slightly modified. Finally it reduces the need for variable, struct and array privatization or live range analyses in automatic parallelizers. To the best of our knowledge, the declaration issue is ignored in the literature: existing C source-to-source compilers either do not support C99, or accept only restricted portions of code, and production compilers use low-level intermediate representations, possibly with annotations. In this way our solution addresses a wider range of compiler analysis issues.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124929845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transforming C++11 Code to C++03 to Support Legacy Compilation Environments","authors":"Gábor Antal, David Havas, István Siket, Árpád Beszédes, R. Ferenc, József Mihalicza","doi":"10.1109/SCAM.2016.11","DOIUrl":"https://doi.org/10.1109/SCAM.2016.11","url":null,"abstract":"Newer technologies - programming languages, environments, libraries - change very rapidly. However, various internal and external constraints often prevent projects from quickly adapting to these changes. Customers may require specific platform compatibility from a software vendor, for example. In this work, we deal with such an issue in the context of the C++ programming language. Our industrial partner is required to use SDKs that support only older C++ language editions. They, however, would like to allow their developers to use the newest language constructs in their code. To address this problem, we created a source code transformation framework to automatically backport source code written according to the C++11 standard to its functionally equivalent C++03 variant. With our framework developers are free to exploit the latest language features, while production code is still built by using a restricted set of available language constructs. This paper reports on the technical details of the transformation engine, and our experiences in applying it to two large industrial code bases and four open-source systems. Our solution is freely available and open-source.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117271402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are My Unit Tests in the Right Package?","authors":"Gergõ Balogh, T. Gergely, Árpád Beszédes, T. Gyimóthy","doi":"10.1109/SCAM.2016.10","DOIUrl":"https://doi.org/10.1109/SCAM.2016.10","url":null,"abstract":"The software development industry has adopted written and de facto standards for creating effective and maintainable unit tests. Unfortunately, like any other source code artifact, they are often written without conforming to these guidelines, or they may evolve into such a state. In this work, we address a specific type of issue related to unit tests. We seek to automatically uncover violations of two fundamental rules: 1) unit tests should exercise only the unit they were designed for, and 2) they should follow a clear packaging convention. Our approach is to use code coverage to investigate the dynamic behaviour of the tests with respect to the code elements of the program, and use this information to identify highly correlated groups of tests and code elements (using a community detection algorithm). This grouping is then compared to the trivial grouping determined by package structure, and any discrepancies found are treated as \"bad smells.\" We report on our related measurements on a set of large open source systems with notable unit test suites, and provide guidelines through examples for refactoring the problematic tests.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128533217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Case for Software Specific Natural Language Techniques","authors":"D. Binkley, Dawn J Lawrie","doi":"10.1109/SCAM.2016.27","DOIUrl":"https://doi.org/10.1109/SCAM.2016.27","url":null,"abstract":"For over two decades, software engineering (SE) researchers have been importing tools and techniques from information retrieval (IR). Initial results have been quite positive. For example, when applied to problems such as feature location or re-establishing traceability links, IR techniques work well on their own, and often even better in combination with more traditional source code analysis techniques such as static and dynamic analysis. However, recently there has been growing awareness among SE researchers that IR tools and techniques are designed to work under a different set of assumptions than those that hold for a software system. Thus it may be beneficial to consider IR inspired tools and techniques that are specifically designed to work with software. One aim of this work is to provide quantitative empirical evidence in support of this observation. To do so a new technique is introduced that captures the level of difficulty found in an information need, the true, often latent, information that a searcher desires to know. The new technique is used to compare two test collections: the natural language TREC 8 collection and the software engineering JabRef collection. Analysis of the data leads to three significant findings. First, the variation in difficulty of the SE information needs is much larger than that of the natural language information needs, second, the most challenging of the SE information needs is far easier than the least challenging of the natural language information needs, and finally, variations of the queries used to uncover a latent information need have far less impact in the natural language collection than in the software engineering collection.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129181046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Exploratory Study of Interface Redundancy in Code Repositories","authors":"A. C. D. Paula, E. Guerra, C. Lopes, Hitesh Sajnani, Otávio Augusto Lazzarini Lemos","doi":"10.1109/SCAM.2016.31","DOIUrl":"https://doi.org/10.1109/SCAM.2016.31","url":null,"abstract":"An important property of software repositories is their level of cross-project redundancy. For instance, much has been done to assess how much code cloning happens across software corpora. In this paper we study a much less targeted type of replication: Interface Redundancy (IR). IR refers to the level of repetition of whole method interfaces - return type, method name, and parameter types - across a code corpus. This type of redundancy is important because if two non-trivial methods ever share the same interface it is very likely that they implement analogous functions, even though their code, structure, or vocabulary might be diverse. A certain level of IR is a requirement for approaches that rely on the recurrence of interfaces to fulfill a given task (e.g., interface-driven code search - IDCS). In this paper we report on an experiment to measure IR in a large-scale Java repository. Our target corpus contains more than 380,000 methods from 99 Java projects extracted randomly from an open source repository. Results are promising as they show that the chance of an interface from a non-trivial method repeating itself across a large repository is around 25% (i.e., approximately 1/4 of such interfaces are redundant). Also, more than 80% of the target projects contained IR (with the average percentage of redundant interfaces for these projects being above 30%). As additional analyses we investigated the distribution of the different types of redundant interfaces (e.g., intra- vs. inter-project), characterized the redundant interfaces and showed that such knowledge can help improve IDCS, and provided evidence that only a very small part of IR refers to method cloning (around 0.002%).","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116691627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statically-Informed Dynamic Analysis Tools to Detect Algorithmic Complexity Vulnerabilities","authors":"Benjamin Holland, Ganesh Ram Santhanam, Payas Awadhutkar, S. Kothari","doi":"10.1109/SCAM.2016.23","DOIUrl":"https://doi.org/10.1109/SCAM.2016.23","url":null,"abstract":"Algorithmic Complexity (AC) vulnerabilities can be exploited to cause a denial of service attack. Specifically, an adversary can design an input to trigger excessive (space/time) resource consumption. It is not possible to build a fully automated tool to detect AC vulnerabilities. Since it is an open-ended problem, a human-in-the-loop exploration is required to find the program loops that could have AC vulnerabilities. Ascertaining whether an arbitrary loop has an AC vulnerability is itself difficult, as it is equivalent to the halting problem. This paper is about a pragmatic engineering approach to detect AC vulnerabilities. It presents a statically-informed dynamic (SID) analysis and two tools that provide critical capabilities for detecting AC vulnerabilities. The first is a static analysis tool for exploring the software to find loops as the potential candidates for AC vulnerabilities. The second is a dynamic analysis tool that can try many different inputs to evaluate the selected loops for excessive resource consumption. The two tools are built and integrated together using the interactive software analysis, transformation, and visualization capabilities provided by the Atlas platform. The paper describes two use cases for the tools, one to detect AC vulnerabilities in Java bytecode and another for students in an undergraduate algorithm class to perform experiments to learn different aspects of algorithmic complexity. Tool and Demo Video: https://ensoftcorp.github.io/SID.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"226 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131445269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conc2Seq: A Frama-C Plugin for Verification of Parallel Compositions of C Programs","authors":"Allan Blanchard, N. Kosmatov, Matthieu Lemerre, F. Loulergue","doi":"10.1109/SCAM.2016.18","DOIUrl":"https://doi.org/10.1109/SCAM.2016.18","url":null,"abstract":"Frama-C is an extensible modular framework for analysis of C programs that offers different analyzers in the form of collaborating plugins. Currently, Frama-C does not support the proof of functional properties of concurrent code. We present Conc2Seq, a new code transformation based tool realized as a Frama-C plugin and dedicated to the verification of concurrent C programs. Assuming the program under verification respects an interleaving semantics, Conc2Seq transforms the original concurrent C program into a sequential one in which concurrency is simulated by interleavings. User specifications are automatically reintegrated into the new code without manual intervention. The goal of the proposed code transformation technique is to allow the user to reason about a concurrent program through the interleaving semantics using existing Frama-C analyzers.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131372222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Security Perspective on Code Review: The Case of Chromium","authors":"M. D. Biase, M. Bruntink, Alberto Bacchelli","doi":"10.1109/SCAM.2016.30","DOIUrl":"https://doi.org/10.1109/SCAM.2016.30","url":null,"abstract":"Modern Code Review (MCR) is an established software development process that aims to improve software quality. Although evidence shows that higher levels of review coverage relate to fewer post-release bugs, the effectiveness of MCR at specifically finding security issues remains unknown. We present work aiming to fill that gap by exploring the MCR process in the Chromium open source project. We manually analyzed large sets of registered (114 cases) and missed (71 cases) security issues by backtracking in the project's issue, review, and code histories. This enabled us to qualify MCR in Chromium from the security perspective from several angles: Are security issues being discussed frequently? What categories of security issues are often missed or found? What characteristics of code reviews appear relevant to the discovery rate? Within the cases we analyzed, MCR in Chromium addresses security issues at a rate of 1% of reviewers' comments. Chromium code reviews mostly tend to miss language-specific issues (e.g., C++ issues and buffer overflows) and domain-specific ones (e.g., Cross-Site Scripting); when code reviews do address issues, they mostly address those that pertain to the latter type. Initial evidence points to reviews conducted by more than 2 reviewers being more successful at finding security issues.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"423 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133322774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Source Level Empirical Study of Features and Their Interactions in Variable Software","authors":"Stefan Fischer, L. Linsbauer, R. Lopez-Herrejon, Alexander Egyed","doi":"10.1109/SCAM.2016.16","DOIUrl":"https://doi.org/10.1109/SCAM.2016.16","url":null,"abstract":"Robust and effective support for the detection and management of features and their interactions is crucial for many software development tasks but has proven to be an elusive goal despite the extensive research and practice on the subject. Providing the required support becomes even more challenging with variable software whereby multiple variants of a system and their features must be collectively considered. An important premise to provide better support for feature interactions in variable systems is the need for a deeper understanding of how features interact at different levels starting from the source level. In this context, recent work has looked at feature interactions from different angles and for different purposes, for instance for developing performance models, extracting interfaces for maintenance or describing feature evolution patterns. However, there is a gap in understanding how features actually interact at the source level in contrast with how features ought to interact according to variability models that describe the valid combinations of features in variable software systems. In this paper we perform an empirical study to explore this gap. We use seven case studies, implemented in Java and C, totalling over nine million LoC, and analysed over seven thousand feature interactions. Our study revealed important inconsistencies between how feature interactions occur at source level and how they are modeled, and corroborated that the majority of source level interactions involve fewer than three features. We discuss the implications of our findings and avenues for further research.","PeriodicalId":407579,"journal":{"name":"2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128522873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}