{"title":"Querying sequential software engineering data","authors":"Chengnian Sun, Haidong Zhang, Jian-Guang Lou, Hongyu Zhang, Qiang Wang, D. Zhang, Siau-Cheng Khoo","doi":"10.1145/2635868.2635902","DOIUrl":"https://doi.org/10.1145/2635868.2635902","url":null,"abstract":"We propose a pattern-based approach to effectively and efficiently analyzing sequential software engineering (SE) data. Different from other types of SE data, sequential SE data preserves unique temporal properties, which cannot be easily analyzed without much programming effort. In order to facilitate the analysis of sequential SE data, we design a sequential pattern query language (SPQL), which specifies the temporal properties based on regular expressions, and is enhanced with variables and statements to store and manipulate matching states. We also propose a query engine to effectively process the SPQL queries. We have applied our approach to analyze two types of SE data, namely bug report history and source code change history. We experiment with 181,213 Eclipse bug reports and 323,989 code revisions of Android. SPQL enables us to explore interesting temporal properties underneath these sequential data with a few lines of query code and low matching overhead. 
The analysis results can help better understand a software process and identify process violations.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129782867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving oracle quality by detecting brittle assertions and unused inputs in tests","authors":"Chen Huo, J. Clause","doi":"10.1145/2635868.2635917","DOIUrl":"https://doi.org/10.1145/2635868.2635917","url":null,"abstract":"Writing oracles is challenging. As a result, developers often create oracles that check too little, resulting in tests that are unable to detect failures, or check too much, resulting in tests that are brittle and difficult to maintain. In this paper we present a new technique for automatically analyzing test oracles. The technique is based on dynamic tainting and detects both brittle assertions—assertions that depend on values that are derived from uncontrolled inputs—and unused inputs—inputs provided by the test that are not checked by an assertion. We also presented OraclePolish, an implementation of the technique that can analyze tests that are written in Java and use the JUnit testing framework. Using OraclePolish, we conducted an empirical evaluation of more than 4000 real test cases. The results of the evaluation show that OraclePolish is effective; it detected 164 tests that contain brittle assertions and 1618 tests that have unused inputs. In addition, the results also demonstrate that the costs associated with using the technique are reasonable.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126178691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The plastic surgery hypothesis","authors":"Earl T. Barr, Yuriy Brun, Premkumar T. Devanbu, M. Harman, Federica Sarro","doi":"10.1145/2635868.2635898","DOIUrl":"https://doi.org/10.1145/2635868.2635898","url":null,"abstract":"Recent work on genetic-programming-based approaches to automatic program patching has relied on the insight that the content of new code can often be assembled out of fragments of code that already exist in the code base. This insight has been dubbed the plastic surgery hypothesis; successful, well-known automatic repair tools such as GenProg rest on this hypothesis, but it has never been validated. We formalize and validate the plastic surgery hypothesis and empirically measure the extent to which raw material for changes actually already exists in projects. In this paper, we mount a large-scale study of several large Java projects, and examine a history of 15,723 commits to determine the extent to which these commits are graftable, i.e., can be reconstituted from existing code, and find an encouraging degree of graftability, surprisingly independent of commit size and type of commit. For example, we find that changes are 43% graftable from the exact version of the software being changed. With a view to investigating the difficulty of finding these grafts, we study the abundance of such grafts in three possible sources: the immediately previous version, prior history, and other projects. We also examine the contiguity or chunking of these grafts, and the degree to which grafts can be found in the same file. 
Our results are quite promising and suggest an optimistic future for automatic program patching methods that search for raw material in already extant code in the project being patched.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117097321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Focus-shifting patterns of OSS developers and their congruence with call graphs","authors":"Qi Xuan, A. Okano, Premkumar T. Devanbu, V. Filkov","doi":"10.1145/2635868.2635914","DOIUrl":"https://doi.org/10.1145/2635868.2635914","url":null,"abstract":"Developers in complex, self-organized open-source projects often work on many different files, and over time switch focus between them. Shifting focus can have an impact on software quality and productivity, and is thus an important topic of investigation. In this paper, we study focus-shifting patterns (FSPs) of developers by comparing trace data from a dozen open source software (OSS) projects of their longitudinal commit activities and file dependencies from the projects' call graphs. Using information theoretic measures of network structure, we find that fairly complex focus-shifting patterns emerge, and FSPs in the same project are more similar to each other. We show that developers tend to shift focus along with, rather than away from, software dependency links described by the call graphs. This tendency becomes weaker as either the interval between successive commits, or the organizational distance between committed files (i.e. directory distance), gets larger. Interestingly, this tendency appears stronger with more productive developers. 
We hope our study will initiate interest in further understanding of FSPs, which can ultimately help to (1) improve current recommender systems to predict the next focus of developers, and (2) provide insight into better call graph design, so as to facilitate developers' work.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"09 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130565688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining micro-practices from operational data","authors":"Minghui Zhou, A. Mockus","doi":"10.1145/2635868.2666611","DOIUrl":"https://doi.org/10.1145/2635868.2666611","url":null,"abstract":"Micro-practices are actual (and usually undocumented or incorrectly documented) activity patterns used by individuals or projects to accomplish basic software development tasks, such as writing code, testing, triaging bugs, or mentoring newcomers. The operational data in software repositories presents the tantalizing possibility to discover such fine-scale behaviors and use them to understand and improve software development. We propose a large-scale evidence-based approach to accomplish this by first creating a mirror of the projects in the open source universe. The next step would involve the inductive generalization from in-depth studies of specific projects from one side and the categorization of micro-practices in the entire universe from the other side.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130761543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selection and presentation practices for code example summarization","authors":"Annie T. T. Ying, M. Robillard","doi":"10.1145/2635868.2635877","DOIUrl":"https://doi.org/10.1145/2635868.2635877","url":null,"abstract":"Code examples are an important source for answering questions about software libraries and applications. Many usage contexts for code examples require them to be distilled to their essence: e.g., when serving as cues to longer documents, or for reminding developers of a previously known idiom. We conducted a study to discover how code can be summarized and why. As part of the study, we collected 156 pairs of code examples and their summaries from 16 participants, along with over 26 hours of think-aloud verbalizations detailing the decisions of the participants during their summarization activities. Based on a qualitative analysis of this data we elicited a list of practices followed by the participants to summarize code examples and propose empirically-supported hypotheses justifying the use of specific practices. One main finding was that none of the participants exclusively extracted code verbatim for the summaries, motivating abstractive summarization. The results provide a grounded basis for the development of code example summarization and presentation technology.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130663232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aalta: an LTL satisfiability checker over Infinite/Finite traces","authors":"Jianwen Li, Yinbo Yao, G. Pu, Lijun Zhang, Jifeng He","doi":"10.1145/2635868.2661669","DOIUrl":"https://doi.org/10.1145/2635868.2661669","url":null,"abstract":"Linear Temporal Logic (LTL) is widely used nowadays in verification and AI. Checking satisfiability of LTL formulas is a fundamental step in removing possible errors in LTL assertions. We present in this paper Aalta, a new LTL satisfiability checker, which supports satisfiability checking for LTL over both infinite and finite traces. Aalta leverages the power of modern SAT solvers. We have conducted a comprehensive comparison between Aalta and other LTL satisfiability checkers, and the experimental results show that Aalta is very competitive. The tool is available at www.lab205.org/aalta.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129199857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experiences developing tools for developers (invited talk)","authors":"J. Penix","doi":"10.1145/2635868.2684429","DOIUrl":"https://doi.org/10.1145/2635868.2684429","url":null,"abstract":"Software Engineers are horrible customers for software tools. If they don't like your tools, they will just write their own. If your tool wastes a few minutes of a developer's day, good luck getting them to ever try your tool again. And if, after years of effort, you manage to develop tools they actually like, you are really in trouble. This is when they start building systems on top of your tools. No API? No problem! They will hack and scrape as needed to get their job done. In this talk I'll go through a number of examples of successes, non-successes and over-successes from the past 8 years of evolving the developer infrastructure at Google. I'll highlight the challenges we faced, our attempts to address the challenges and share our lessons learned.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116710650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic mining of specifications from invocation traces and method invariants","authors":"Ivo Krka, Yuriy Brun, N. Medvidović","doi":"10.1145/2635868.2635890","DOIUrl":"https://doi.org/10.1145/2635868.2635890","url":null,"abstract":"Software library documentation often describes individual methods' APIs, but not the intended protocols and method interactions. This can lead to library misuse, and restrict runtime detection of protocol violations and automated verification of software that uses the library. Specification mining, if accurate, can help mitigate these issues, which has led to significant research into new model-inference techniques that produce FSM-based models from program invariants and execution traces. However, there is currently a lack of empirical studies that, in a principled way, measure the impact of the inference strategies on model quality. To this end, we identify four such strategies and systematically study the quality of the models they produce for nine off-the-shelf libraries. We find that (1) using invariants to infer an initial model significantly improves model quality, increasing precision by 4% and recall by 41%, on average; (2) effective invariant filtering is crucial for quality and scalability of strategies that use invariants; and (3) using traces in combination with invariants greatly improves robustness to input noise. We present our empirical evaluation, implement new and extend existing model-inference techniques, and make public our implementations, ground-truth models, and experimental data. 
Our work can lead to higher-quality model inference, and directly improve the techniques and tools that rely on model inference.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117202883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Static analysis driven performance and energy testing","authors":"Abhijeet Banerjee","doi":"10.1145/2635868.2666602","DOIUrl":"https://doi.org/10.1145/2635868.2666602","url":null,"abstract":"Software testing is the process of evaluating the properties of a software. Properties of a software can be divided into two categories: functional properties and non-functional properties. Properties that influence the input-output relationship of the software can be categorized as functional properties. On the other hand, properties that do not influence the input-output relationship of the software directly can be categorized as non-functional properties. In the context of real-time system software, testing functional as well as non-functional properties is equally important. Over the years, a considerable amount of research effort has been dedicated to developing tools and techniques that systematically test various functional properties of a software. However, the same cannot be said about testing non-functional properties. Systematic testing of non-functional properties is often much more challenging than testing functional properties. This is because non-functional properties depend not only on the inputs to the program but also on the underlying hardware. Additionally, unlike the functional properties, non-functional properties are seldom annotated in the software itself. Such challenges provide the objectives for this work. 
The primary objective of this work is to explore and address the major challenges in testing non-functional properties of a software.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123598642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}