{"title":"Rebasing in Code Review Considered Harmful: A Large-Scale Empirical Investigation","authors":"M. Paixão, P. Maia","doi":"10.1109/SCAM.2019.00014","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00014","url":null,"abstract":"Code review has been widely acknowledged as a key quality assurance process in both open-source and industrial software development. Due to the asynchronicity of the code review process, the system's codebase tends to incorporate external commits while a source code change is reviewed, which cause the need for rebasing operations. External commits have the potential to modify files currently under review, which causes re-work for developers and fatigue for reviewers. Since source code changes observed during code review may be due to external commits, rebasing operations may pose a severe threat to empirical studies that employ code review data. Yet, to the best of our knowledge, there is no empirical study that characterises and investigates rebasing in real-world software systems. Hence, this paper reports an empirical investigation aimed at understanding the frequency in which rebasing operations occur and their side-effects in the reviewing process. To achieve so, we perform an in-depth large-scale empirical investigation of the code review data of 11 software systems, 28,808 code reviews and 99,121 revisions. Our observations indicate that developers need to perform rebasing operations in an average of 75.35% of code reviews. In addition, our data suggests that an average of 34.21% of rebasing operations tend to tamper with the reviewing process. Finally, we propose a methodology to handle rebasing in empirical studies that employ code review data. We show how an empirical study that does not account for rebasing operations may report skewed, biased and inaccurate observations.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115499756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software Engineering by Source Transformation – Experience with TXL (Most Influential Paper, SCAM 2001)","authors":"J. Cordy, T. Dean, A. Malton, Kevin A. Schneider","doi":"10.1109/SCAM.2019.00024","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00024","url":null,"abstract":"Most influential paper of SCAM 2001.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"10 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114935535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatically Curated Data Sets","authors":"Marcus Kessel, C. Atkinson","doi":"10.1109/SCAM.2019.00015","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00015","url":null,"abstract":"o validate hypotheses and tools that depend on the semantics of software, it is necessary to assemble, prepare and maintain (i.e. curate) large, high-quality corpora of executable software systems exhibiting certain desired behavior and/or properties. Today this is a highly tedious and laborious activity requiring significant human time and effort. In this paper we therefore present a prototype platform that supports the notion of “live data sets” where almost all aspects of the data set curation process are automated. Instead of curating data sets by hand, or writing dedicated tools to select and check software samples on a case-by-case basis, a live data set allows users to simply describe their requirements as abstract scripts written in a declarative domain specific language. After explaining the approach and the key ideas behind its implementation, in this paper we present two examples of executable corpora generated automatically from a live data set populated from Maven Central. The first illustrates a “semantics agnostic” use case where the actual behavior of the software is unimportant, while the second illustrates a “semantics specific” use case where software implementing a specific functional abstraction is selected.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130036565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Tiwari, Jyoti Prakash, S. Groß, Christian Hammer
{"title":"LUDroid: A Large Scale Analysis of Android – Web Hybridization","authors":"A. Tiwari, Jyoti Prakash, S. Groß, Christian Hammer","doi":"10.1109/SCAM.2019.00036","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00036","url":null,"abstract":"Many Android applications embed webpages via WebView components and execute JavaScript code within Android. Hybrid applications leverage dedicated APIs to load a resource and render it in WebView. Furthermore, Android objects can be shared with the JavaScript world. However, bridging the interfaces of the Android and JavaScript world might also incur severe security threats: Potentially untrusted webpages and their JavaScript might interfere with the Android environment and its access to native features. No general analysis is currently available to assess the implications of such hybrid apps bridging the two worlds. To understand the semantics and effects of hybrid apps, we perform a large-scale study on the usage of the hybridization APIs in the wild. We analyze and categorize the parameters to hybridization APIs for 7,500 randomly selected applications from the Google Playstore. Our results advance the general understanding of hybrid applications, as well as implications for potential program analyses, and the current security situation: We discover 6,375 flows of sensitive data from Android to JavaScript, out of which 82% could flow to potentially untrustworthy code. Our analysis identified 365 web pages embedding vulnerabilities and we exemplarily exploit them. Additionally, we discover 653 applications in which potentially untrusted Javascript code may interfere with (trusted) Android objects.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121339861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Diego Marcilio, Carlo A. Furia, R. Bonifácio, G. Pinto
{"title":"Automatically Generating Fix Suggestions in Response to Static Code Analysis Warnings","authors":"Diego Marcilio, Carlo A. Furia, R. Bonifácio, G. Pinto","doi":"10.1109/SCAM.2019.00013","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00013","url":null,"abstract":"Static code analysis tools such as FindBugs and SonarQube are widely used on open-source and industrial projects to detect a variety of issues that may negatively affect the quality of software. Despite these tools' popularity and high level of automation, several empirical studies report that developers normally fix only a small fraction (typically, less than 10% [1]) of the reported issues—so-called \"warnings\". If these analysis tools could also automatically provide suggestions on how to fix the issues that trigger some of the warnings, their feedback would become more actionable and more directly useful to developers. In this work, we investigate whether it is feasible to automatically generate fix suggestions for common warnings issued by static code analysis tools, and to what extent developers are willing to accept such suggestions into the codebases they're maintaining. To this end, we implemented a Java program transformation technique that fixes 11 distinct rules checked by two well-known static code analysis tools (SonarQube and SpotBugs). Fix suggestions are generated automatically based on templates, which are instantiated in a way that removes the source of the warnings; templates for some rules are even capable of producing multi-line patches. We submitted 38 pull requests, including 920 fixes generated automatically by our technique for various open-source Java projects, including the Eclipse IDE and both SonarQube and SpotBugs tools. At the time of writing, project maintainers accepted 84% of our fix suggestions (95% of them without any modifications). These results indicate that our approach to generating fix suggestions is feasible, and can help increase the applicability of static code analysis tools.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"20 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121009283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Less is More: From Multi-objective to Mono-objective Refactoring via Developer's Knowledge Extraction","authors":"Vahid Alizadeh, Houcem Fehri, M. Kessentini","doi":"10.1109/SCAM.2019.00029","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00029","url":null,"abstract":"Refactoring studies either aggregated quality metrics to evaluate possible code changes or treated them separately to find trade-offs. For the first category of work, it is challenging to define upfront the weights for the quality objectives since developers are not able to express them upfront. For the second category of work, the number of possible trade-offs between quality objectives is large which makes developers reluctant to look at many refactoring solutions. In this paper, we propose, for the first time, a way to convert multi-objective search into a mono-objective one after interacting with the developer to identify a good refactoring solution based on his preferences. The first step consists of using a multi-objective search to generate different possible refactoring strategies by finding a trade-off between several conflicting quality attributes. Then, an unsupervised learning algorithm clusters the different trade-off solutions, called the Pareto front, to guide the developers in selecting their region of interests and to reduce the number of refactoring options to explore. Finally, the extracted preferences from the developer are used to transform the multi-objective search into a mono-objective one by taking the preferred cluster of the Pareto front as the initial population for the mono-objective search and generating an evaluation function based on the weights that are automatically computed from the position of the cluster in the Pareto front. Thus, the developer will just interact with only one refactoring solution generated by the mono-objective search. We selected 32 participants to manually evaluate the effectiveness of our tool on 7 open source projects and one industrial project. The results show that the recommended refactorings are more accurate than the current state of the art.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121066682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing Privacy in Screen Event Frequency Analysis for Android Apps","authors":"Hailong Zhang, S. Latif, Raef Bassily, A. Rountev","doi":"10.1109/SCAM.2019.00037","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00037","url":null,"abstract":"Mobile apps often use analytics infrastructures provided by companies such as Google and Facebook to gather extensive fine-grained data about app performance and user behaviors. It is important to understand and enforce suitable trade-offs between the benefits of such data gathering (for app developers) and the corresponding privacy loss (for app users). Our work focuses on screen event frequency analysis, which is one of the most popular forms of data gathering in mobile app analytics. We propose a privacy-preserving version of such analysis using differential privacy (DP), a popular principled approach for creating privacy-preserving analyses. We describe how DP can be introduced in screen event frequency analysis for mobile apps, and demonstrate an instance of this approach for Android apps and the Google Analytics framework. Our work develops the automated app code analysis, code rewriting, and run-time processing needed to deploy the proposed DP solution. Experimental evaluation demonstrates that high accuracy and practical cost can be achieved by the developed privacy-preserving screen event frequency analysis.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130081568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Behave Nicely! Automatic Generation of Code for Behaviour Driven Development Test Suites","authors":"Tim Storer, Ruxandra Bob","doi":"10.1109/SCAM.2019.00033","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00033","url":null,"abstract":"Behaviour driven development (BDD) has gained widespread use in the software industry. System specifications can be expressed as test scenarios, describing the circumstances, actions and expected outcomes. These scenarios are written in a structured natural language (Gherkin), with each step in the scenario associated with a corresponding step implementation function in the underlying programming language. A challenge recognised by industry is ensuring that the natural language scenarios, step implementation functions and underlying system implementation remain consistent with one another, requiring on-going maintenance effort as changes are made to a system. To address this, we have developed behave_nicely, a tool, for automatically generating step implementation functions from structured natural language steps, with the intention of eliminating the need for maintaining step implementation functions. We evaluated our approach on a sample of 20 white box and 50 black box projects using behaviour driven development, drawn from GitHub. Our results show that behave_nicely can generate step implementation functions for 80% of the white box and 17% of black box projects. We conclude that (a) there is significant potential for automating the process of code generation for BDD tests and (b) that the development of guidelines for writing tests in Gherkin would significantly improve the results.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122630068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wanessa Teotônio, P. González, P. Maia, Pedro Muniz
{"title":"WAL: A Tool for Diagnosing Accessibility Issues and Evolving Legacy Web Systems at Runtime","authors":"Wanessa Teotônio, P. González, P. Maia, Pedro Muniz","doi":"10.1109/SCAM.2019.00028","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00028","url":null,"abstract":"The vast majority of pages available on the Web are developed under the tacit assumption that the final user will not have any kind of disability. Thus, accessibility aspects are not contemplated in their development process. As a consequence, many users encounter interaction barriers and cannot access much of the content in the Web. But the improvement of Websites can be challenging due to their high complexity and lack of documentation, the impossibility of accessing the source code, the use of out-of-dated technologies, and the fact that development teams are often unaware of accessibility rules. To mitigate this problem, this paper proposes a tool, called Website Accessibility Layer (WAL), which aims at both inserting accessibility features into existing Web pages at runtime and diagnosing accessibility issues through a dynamic source code analysis. The current version of the tool contributes to the accessibility of users with visual impairment, dyslexia, and reading difficulties, with respect to 13 Web page resources. The proposal has been validated by being applied to 40 widely accessed Websites with satisfactory results, proving the feasibility of the solution. Furthermore, real users also evaluated positively the benefits brought by the tool to their daily work activities.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128307220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Soumaya Rebai, Oussama Ben Sghaier, Vahid Alizadeh, M. Kessentini, Meriem Chater
{"title":"Interactive Refactoring Documentation Bot","authors":"Soumaya Rebai, Oussama Ben Sghaier, Vahid Alizadeh, M. Kessentini, Meriem Chater","doi":"10.1109/SCAM.2019.00026","DOIUrl":"https://doi.org/10.1109/SCAM.2019.00026","url":null,"abstract":"The documentation of code changes is significantly important but developers ignore it, most of the time, due to the pressure of the deadlines. While developers may document the most important features modification or bugs fixing, recent empirical studies show that the documentation of quality improvements and/or refactoring is often omitted or not accurately described. However, the automated or semi-automated documentation of refactorings has not been yet explored despite the extensive work on the remaining steps of refactoring including the detection, prioritization and recommendation. In this paper, we propose a semi-automated refactoring documentation bot that helps developers to interactively check and validate the documentation of the refactorings and/or quality improvements at the file level for each opened pull-request before being reviewed or merged to the master. The bot starts by checking the pullrequest if there are significant quality changes and refactorings at the file level and whether they are documented by the developer. Then, it checks the validity of the developers description of the refactorings, if any. Based on that analysis, the documentation bot will recommend a message to document the refactorings, their locations and the quality improvement for that pull-request when missing information is found. Then, the developer can modify his pull request description by interacting with the bot to accept/modify/reject part of the proposed documentation. Since refactoring do not happen in isolation most of the time, the bot is documenting the impact of a sequence of refactorings, in a pull-request, on quality and not each refactoring in isolation. We conducted a human survey with 14 active developers to manually evaluate the relevance and the correctness of our tool on different pull requests of 5 open source projects and one industrial system. The results show that the participants found that our bot facilitates the documentation of their quality-related changes and refactorings.","PeriodicalId":431316,"journal":{"name":"2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114673834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}