2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR): Latest Papers, Page 2

A Deeper Look into Bug Fixes: Patterns, Replacements, Deletions, and Additions

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903495

Mauricio Soto, Ferdian Thung, Chu-Pan Wong, Claire Le Goues, D. Lo

Citations: 56

A Dataset of Simplified Syntax Trees for C#

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903507

Sebastian Proksch, Sven Amann, Sarah Nadi, M. Mezini

Citations: 14

Does Your Configuration Code Smell?

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901761

Tushar Sharma, Marios Fragkoulis, D. Spinellis

Abstract: Infrastructure as Code (IaC) is the practice of specifying computing system configurations through code, and managing them through traditional software engineering methods. The wide adoption of configuration management and increasing size and complexity of the associated code, prompt for assessing, maintaining, and improving the configuration code's quality. In this context, traditional software engineering knowledge and best practices associated with code quality management can be leveraged to assess and manage configuration code quality. We propose a catalog of 13 implementation and 11 design configuration smells, where each smell violates recommended best practices for configuration code. We analyzed 4,621 Puppet repositories containing 8.9 million lines of code and detected the cataloged implementation and design configuration smells. Our analysis reveals that the design configuration smells show 9% higher average co-occurrence among themselves than the implementation configuration smells. We also observed that configuration smells belonging to a smell category tend to co-occur with configuration smells belonging to another smell category when correlation is computed by volume of identified smells. Finally, design configuration smell density shows negative correlation whereas implementation configuration smell density exhibits no correlation with the size of a configuration management system.

Pages: 189-200

Citations: 129

Topic Modeling of NASA Space System Problem Reports: Research in Practice

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901760

L. Layman, A. Nikora, Joshua Meek, T. Menzies

Abstract: Problem reports at NASA are similar to bug reports: they capture defects found during test, post-launch operational anomalies, and document the investigation and corrective action of the issue. These artifacts are a rich source of lessons learned for NASA, but are expensive to analyze since problem reports are comprised primarily of natural language text. We apply topic modeling to a corpus of NASA problem reports to extract trends in testing and operational failures. We collected 16,669 problem reports from six NASA space flight missions and applied Latent Dirichlet Allocation topic modeling to the document corpus. We analyze the most popular topics within and across missions, and how popular topics changed over the lifetime of a mission. We find that hardware material and flight software issues are common during the integration and testing phase, while ground station software and equipment issues are more common during the operations phase. We identify a number of challenges in topic modeling for trend analysis: 1) that the process of selecting the topic modeling parameters lacks definitive guidance, 2) defining semantically-meaningful topic labels requires non-trivial effort and domain expertise, 3) topic models derived from the combined corpus of the six missions were biased toward the larger missions, and 4) topics must be semantically distinct as well as cohesive to be useful. Nonetheless, topic modeling can identify problem themes within missions and across mission lifetimes, providing useful feedback to engineers and project managers.

Pages: 303-314

Citations: 22

How the R Community Creates and Curates Knowledge: A Comparative Study of Stack Overflow and Mailing Lists

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901772

A. Zagalsky, Carlos Gómez Teshima, D. Germán, M. Storey, Germán Poo-Caamaño

Abstract: One of the many effects of social media in software development is the flourishing of very large communities of practice where members share a common interest, such as programming languages, frameworks, and tools. These communities of practice use many different communication channels but little is known about how these communities create, share, and curate knowledge using such channels. In this paper, we report a qualitative study of how one community of practice, the R software development community, creates and curates knowledge associated with questions and answers (Q&A) in two of its main communication channels: the R-tag in Stack Overflow and the R-users mailing list. The results reveal that knowledge is created and curated in two main forms: participatory, where multiple members explicitly collaborate to build knowledge, and crowdsourced, where individuals work independently of each other. The contribution of this paper is a characterization of knowledge types that are exchanged by these communities of practice, including a description of the reasons why members choose one channel over the other. Finally, this paper enumerates a set of recommendations to assist practitioners in the use of multiple channels for Q&A.

Pages: 441-451

Citations: 36

Judging a Commit by Its Cover: Correlating Commit Message Entropy with Build Status on Travis-CI

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903493

E. Santos, Abram Hindle

Citations: 32

Analysis of Exception Handling Patterns in Java Projects: An Empirical Study

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903499

Suman Nakshatri, Maithri Hegde, Sahithi Thandra

Citations: 33

MUBench: A Benchmark for API-Misuse Detectors

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903506

Sven Amann, Sarah Nadi, H. Nguyen, T. Nguyen, M. Mezini

Citations: 78

Adressing Problems with External Validity of Repository Mining Studies Through a Smart Data Platform

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901753

Fabian Trautsch, S. Herbold, Philip Makedonski, J. Grabowski

Abstract: Research in software repository mining has grown considerably the last decade. Due to the data-driven nature of this venue of investigation, we identified several problems within the current state-of-the-art that pose a threat to the external validity of results. The heavy re-use of data sets in many studies may invalidate the results in case problems with the data itself are identified. Moreover, for many studies data and/or the implementations are not available, which hinders a replication of the results and, thereby, decreases the comparability between studies. Even if all information about the studies is available, the diversity of the used tooling can make their replication even then very hard. Within this paper, we discuss a potential solution to these problems through a cloud-based platform that integrates data collection and analytics. We created the prototype SmartSHARK that implements our approach. Using SmartSHARK, we collected data from several projects and created different analytic examples. Within this article, we present SmartSHARK and discuss our experiences regarding the use of SmartSHARK and the mentioned problems.

Pages: 97-108

Citations: 24

Multi-extract and Multi-level Dataset of Mozilla Issue Tracking History

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903502

Jiaxin Zhu, Minghui Zhou, Hong Mei

Abstract: Many studies analyze issue tracking repositories to understand and support software development. To facilitate the analyses, we share a Mozilla issue tracking dataset covering a 15-year history. The dataset includes three extracts and multiple levels for each extract. The three extracts were retrieved through two channels, a front-end (web user interface (UI)), and a back-end (official database dump) of Mozilla Bugzilla at three different times. The variations (dynamics) among extracts provide space for researchers to reproduce and validate their studies, while revealing potential opportunities for studies that otherwise could not be conducted. We provide different data levels for each extract ranging from raw data to standardized data as well as to the calculated data level for targeting specific research questions. Data retrieving and processing scripts related to each data level are offered too. By employing the multi-level structure, analysts can more efficiently start an inquiry from the standardized level and easily trace the data chain when necessary (e.g., to verify if a phenomenon reflected by the data is an actual event). We applied this dataset to several published studies and intend to expand the multi-level and multi-extract feature to other software engineering datasets.

Pages: 472-475

Citations: 8