{"title":"Mining and Extraction of Personal Software Process Measures through IDE Interaction Logs","authors":"Alireza Joonbakhsh, A. Sami","doi":"10.1145/3196398.3196462","DOIUrl":"https://doi.org/10.1145/3196398.3196462","url":null,"abstract":"The Personal Software Process (PSP) is an effective software process improvement method that heavily relies on manual collection of software development data. This paper describes a semi-automated method that reduces the burden of PSP data collection by extracting the required time and size of PSP measurements from IDE interaction logs. The tool mines enriched event data streams, so it can be easily generalized to other development environments. In addition, the proposed method is adaptable to phase definition changes and creates activity visualizations and summarizations that are helpful for software project management. Tools and processed data used for this paper are available on GitHub at: https://github.com/unknowngithubuser1/data.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"30 1","pages":"78-81"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87657904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Conflict Predictors in Open-Source Java Projects","authors":"Paola R. G. Accioly, Paulo Borba, L. Silva, Guilherme Cavalcanti","doi":"10.1145/3196398.3196437","DOIUrl":"https://doi.org/10.1145/3196398.3196437","url":null,"abstract":"In collaborative development environments integration conflicts occur frequently. To alleviate this problem, different awareness tools have been proposed to alert developers about potential conflicts before they become too complex. However, there is not much empirical evidence supporting the strategies used by these tools. Learning about what types of changes most likely lead to conflicts might help to derive more appropriate requirements for early conflict detection, and suggest improvements to existing conflict detection tools. To bring such evidence, in this paper we analyze the effectiveness of two types of code changes as conflict predictors, namely edits to the same method and edits to directly dependent methods. We conduct an empirical study analyzing part of the development history of 45 Java projects from GitHub and Travis CI, including 5,647 merge scenarios, to compute the precision and recall for the aforementioned conflict predictors. Our results indicate that the predictors combined have a precision of 57.99% and a recall of 82.67%. Moreover, we conduct a manual analysis which provides insights about strategies that could further increase the precision and the recall.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"85 1","pages":"576-586"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84340739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Patch-Flow Method for Measuring Inner Source Collaboration","authors":"Maximilian Capraro, Michael Dorner, D. Riehle","doi":"10.1145/3196398.3196417","DOIUrl":"https://doi.org/10.1145/3196398.3196417","url":null,"abstract":"Inner source (IS) is the use of open source software development (SD) practices and the establishment of an open source-like culture within an organization. IS enables and requires developers to collaborate more than traditional SD methods such as plan-driven or agile development. To better understand IS, researchers and practitioners need to measure IS collaboration. However, there is no method yet for doing so. In this paper, we present a method for measuring IS collaboration by measuring the patch-flow within an organization. Patch-flow is the flow of code contributions across organizational boundaries such as project, organizational unit, or profit center boundaries. We evaluate our patch-flow measurement method using case study research with a software developing multi-industry company. By applying the method in the case organization, we evaluate its relevance and viability and discuss its usefulness. We found that about half (47.9%) of all code contributions constitute patch-flow between organizational units, almost all (42.2%) being between organizational units working on different products. Such significant patch-flow indicates high relevance of the patch-flow phenomenon and hence the method presented in this paper. Our patch-flow measurement method is the first of its kind to measure and quantify IS collaboration. It can serve as a base for further quantitative analyses of IS collaboration.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"67 1","pages":"515-525"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86336339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Graph-Based Dataset of Commit History of Real-World Android apps","authors":"F. Geiger, I. Malavolta, L. Pascarella, Fabio Palomba, D. D. Nucci, Alberto Bacchelli","doi":"10.1145/3196398.3196460","DOIUrl":"https://doi.org/10.1145/3196398.3196460","url":null,"abstract":"Obtaining a good dataset to conduct empirical studies on the engineering of Android apps is an open challenge. To start tackling this challenge, we present AndroidTimeMachine, the first, self-contained, publicly available dataset weaving spread-out data sources about real-world, open-source Android apps. Encoded as a graph-based database, AndroidTimeMachine concerns 8,431 real open-source Android apps and contains: (i) metadata about the apps' GitHub projects, (ii) Git repositories with full commit history and (iii) metadata extracted from the Google Play store, such as app ratings and permissions.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":"30-33"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89419316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Was Self-Admitted Technical Debt Removal a Real Removal? An In-Depth Perspective","authors":"Fiorella Zampetti, Alexander Serebrenik, M. D. Penta","doi":"10.1145/3196398.3196423","DOIUrl":"https://doi.org/10.1145/3196398.3196423","url":null,"abstract":"Technical Debt (TD) has been defined as \"code being not quite right yet\", and its presence is often self-admitted by developers through comments. The purpose of such comments is to keep track of TD and appropriately address it when possible. Building on a previous quantitative investigation by Maldonado et al. on the removal of self-admitted technical debt (SATD), in this paper we perform an in-depth quantitative and qualitative study of how SATD is addressed in five Java open source projects. On the one hand, we look at whether SATD is \"accidentally\" removed, and the extent to which the SATD removal is being documented. We found that (i) between 20% and 50% of SATD comments are accidentally removed while entire classes or methods are dropped, (ii) 8% of the SATD removal is acknowledged in commit messages, and (iii) while most of the changes addressing SATD require complex source code changes, very often SATD is addressed by specific changes to method calls or conditionals. Our results can be used to better plan TD management or learn patterns for addressing certain kinds of TD and provide recommendations to developers.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"102 1","pages":"526-536"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80634154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Design Structure Matrix Approach for Measuring Co-change-Modularity of Software Products","authors":"R. Benkoczi, D. Gaur, S. Hossain, M. A. Khan","doi":"10.1145/3196398.3196409","DOIUrl":"https://doi.org/10.1145/3196398.3196409","url":null,"abstract":"Several authors have quantified the modularity of software systems in terms of coupling and cohesion metrics. Most of these approaches focus on functional and procedural dependencies in the system. Although highly relevant at the design phase, these static dependencies alone do not account for how a software product evolves over time. Instead, this is also dictated by logical and hidden dependencies between system files. To a large extent, the co-change (co-commit) relation captures these different types of dependencies. In this paper, we define two measures of co-change-modularity of a software product based on a weighted design structure matrix (DSM). The first metric, called the weighted propagation cost, uses matrix exponential to measure how changes to one system file potentially affect the whole product. The second metric, called the weighted clustering cost, uses the output of the first metric to measure the partitionability of the system based on the co-change relation. In addition, we provide a visual representation of how the co-change structure of a system evolves over time. We discuss the theoretical foundation of our work and highlight its advantages over existing methodologies. We apply our approach to GNU Octave and show the findings to be consistent with the available literature on the evolution of Octave. Our analysis is extensible and applicable to a range of scenarios including open source systems.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"3 1","pages":"331-335"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88648957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Is Video Game Development Different from Software Development in Open Source?","authors":"L. Pascarella, Fabio Palomba, M. D. Penta, Alberto Bacchelli","doi":"10.1145/3196398.3196418","DOIUrl":"https://doi.org/10.1145/3196398.3196418","url":null,"abstract":"Recent research has provided evidence that, in the industrial context, developing video games diverges from developing software systems in other domains, such as office suites and system utilities. In this paper, we consider video game development in the open source system (OSS) context. Specifically, we investigate how developers contribute to video games vs. non-games by working on different kinds of artifacts, how they handle malfunctions, and how they perceive the development process of their projects. To this purpose, we conducted a mixed, qualitative and quantitative study on a broad suite of 60 OSS projects. Our results confirm the existence of significant differences between game and non-game development, in terms of how project resources are organized and in the diversity of developers' specializations. Moreover, game developers responding to our survey perceive more difficulties than other developers when reusing code as well as performing automated testing, and they lack a clear overview of their system's requirements.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"2 1","pages":"392-402"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81529859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Requirements and Traceability Information to Improve Bug Localization","authors":"M. Rath, D. Lo, Patrick Mäder","doi":"10.1145/3196398.3196415","DOIUrl":"https://doi.org/10.1145/3196398.3196415","url":null,"abstract":"Locating bugs in industry-size software systems is time consuming and challenging. An automated approach for assisting the process of tracing from bug descriptions to relevant source code benefits developers. A large body of previous work aims to address this problem and demonstrates considerable achievements. Most existing approaches focus on the key challenge of improving techniques based on textual similarity to identify relevant files. However, there exists a lexical gap between the natural language used to formulate bug reports and the formal source code and its comments. To bridge this gap, state-of-the-art approaches contain a component for analyzing bug history information to increase retrieval performance. In this paper, we propose a novel approach, TraceScore, that also utilizes projects' requirements information and explicit dependency trace links to further close the gap in order to relate a new bug report to defective source code files. Our evaluation on more than 13,000 bug reports shows that TraceScore significantly outperforms two state-of-the-art methods. Further, by integrating TraceScore into an existing bug localization algorithm, we found that TraceScore significantly improves retrieval performance by 49% in terms of mean average precision (MAP).","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"62 1 1","pages":"442-453"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75219814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting and Characterizing Developer Behavior Following Opportunistic Reuse of Code Snippets from the Web","authors":"Agnieszka Ciborowska, Nicholas A. Kraft, Kostadin Damevski","doi":"10.1145/3196398.3196467","DOIUrl":"https://doi.org/10.1145/3196398.3196467","url":null,"abstract":"Modern software development is social and relies on many online resources and tools. In this paper, we study opportunistic code reuse from the Web, e.g. when developers copy code snippets from popular Q&A sites and paste them into their projects. Our focus is the behavior of developers following opportunistic code reuse, which reveals the success or failure of the action. We study developer behavior via a large, representative dataset of micro-interactions in the IDE. Our analysis of developer behavior exhibited in this dataset confirms laboratory study observations that code reuse from the Web is followed by heavy editing, in some cases by a rapid undo, and rarely by the execution of tests.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"228 4 1","pages":"94-97"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72682906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bugs.jar: A Large-Scale, Diverse Dataset of Real-World Java Bugs","authors":"Ripon K. Saha, Yingjun Lyu, Wing Lam, H. Yoshida, M. Prasad","doi":"10.1145/3196398.3196473","DOIUrl":"https://doi.org/10.1145/3196398.3196473","url":null,"abstract":"We present Bugs.jar, a large-scale dataset for research in automated debugging, patching, and testing of Java programs. Bugs.jar is comprised of 1,158 bugs and patches, drawn from 8 large, popular open-source Java projects, spanning 8 diverse and prominent application categories. It is an order of magnitude larger than Defects4J, the only other dataset in its class. We discuss the methodology used for constructing Bugs.jar, the representation of the dataset, several use-cases, and an illustration of three of the use-cases through the application of 3 specific tools on Bugs.jar, namely our own tool, Elixir, and two third-party tools, Ekstazi and JaCoCo.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":"10-13"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89206966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}