2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR): Latest Papers

How does a typical tutorial for mobile development look like?
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597106
Rebecca Tiarks, W. Maalej
{"title":"How does a typical tutorial for mobile development look like?","authors":"Rebecca Tiarks, W. Maalej","doi":"10.1145/2597073.2597106","DOIUrl":"https://doi.org/10.1145/2597073.2597106","url":null,"abstract":"We report on an exploratory study, which aims at understanding how development tutorials are structured, what types of tutorials exist, and how official tutorials differ from tutorials written by development communities. We analyzed over 1,200 tutorials for mobile application development provided by six different sources for the three major platforms: Android, Apple iOS, and Windows Phone. We found that a typical tutorial contains around 2700 words distributed over 4 pages and including a list of instructions with 18 items. Overall, 70% of the tutorials contain source code examples and a similar fraction contain images. On average, one tutorial has 6 images. When analyzing the images, we found that the studied iOS community posted the largest number of images, 14 images per tutorial, on average, from which 74% are plain images, i.e., mainly screenshots without stencils, diagrams, or highlights. In contrast, 36% of the images included in the official tutorials by Apple were diagrams or images with stencils. Community sites seem to follow a similar structure to the official sites but include items and images which are rather underrepresented in the official tutorials. From the analysis of the tutorials content by means of natural language processing combined with manual content analysis, we derived four categories for mobile development tutorials: infrastructure and design, application and services, distribution and maintenance, and development platform. Our categorization can help tutorial writers to better organize and evaluate the content of their tutorials and identify missing tutorials.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"7 1","pages":"272-281"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74531301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Incremental origin analysis of source code files
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597111
Daniela Steidl, B. Hummel, Elmar Jürgens
{"title":"Incremental origin analysis of source code files","authors":"Daniela Steidl, B. Hummel, Elmar Jürgens","doi":"10.1145/2597073.2597111","DOIUrl":"https://doi.org/10.1145/2597073.2597111","url":null,"abstract":"The history of software systems tracked by version control systems is often incomplete because many file movements are not recorded. However, static code analyses that mine the file history, such as change frequency or code churn, produce precise results only if the complete history of a source code file is available. In this paper, we show that up to 38.9% of the files in open source systems have an incomplete history, and we propose an incremental, commit-based approach to reconstruct the history based on clone information and name similarity. With this approach, the history of a file can be reconstructed across repository boundaries and thus provides accurate information for any source code analysis. We evaluate the approach in terms of correctness, completeness, performance, and relevance with a case study among seven open source systems and a developer survey.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"09 1","pages":"42-51"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85056969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Process mining multiple repositories for software defect resolution from control and organizational perspective
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597081
Monika Gupta, A. Sureka, S. Padmanabhuni
{"title":"Process mining multiple repositories for software defect resolution from control and organizational perspective","authors":"Monika Gupta, A. Sureka, S. Padmanabhuni","doi":"10.1145/2597073.2597081","DOIUrl":"https://doi.org/10.1145/2597073.2597081","url":null,"abstract":"Issue reporting and resolution is a software engineering process supported by tools such as Issue Tracking System (ITS), Peer Code Review (PCR) system and Version Control System (VCS). Several open source software projects such as Google Chromium and Android follow process in which a defect or feature enhancement request is reported to an issue tracker followed by source-code change or patch review and patch commit using a version control system. We present an application of process mining three software repositories (ITS, PCR and VCS) from control flow and organizational perspective for effective process management. ITS, PCR and VCS are not explicitly linked so we implement regular expression based heuristics to integrate data from three repositories for Google Chromium project. We define activities such as bug reporting, bug fixing, bug verification, patch submission, patch review, and source code commit and create an event log of the bug resolution process. The extracted event log contains audit trail data such as caseID, timestamp, activity name and performer. We discover runtime process model for bug resolution process spanning three repositories using process mining tool, Disco, and conduct process performance and efficiency analysis. We identify bottlenecks, define and detect basic and composite anti-patterns. In addition to control flow analysis, we mine event log to perform organizational analysis and discover metrics such as handover of work, subcontracting, joint cases and joint activities.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"81 1","pages":"122-131"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80877785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38
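The regular-expression linking step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that commit messages carry `BUG=<ids>` and `Review URL: .../<id>` lines; the two patterns, the function name `link_commit`, and the example host are hypothetical and are not the authors' actual heuristics, which the abstract does not spell out.

```python
import re

# Hypothetical patterns (assumptions, not taken from the paper).
# BUG=123 or BUG=123, 456 -> issue tracker IDs
BUG_RE = re.compile(r"\bBUG\s*=\s*((?:\d+\s*,\s*)*\d+)", re.IGNORECASE)
# Review URL: https://host/12345 -> code review ID (last path segment)
REVIEW_RE = re.compile(r"Review URL:\s*\S*?/(\d+)\b", re.IGNORECASE)


def link_commit(message):
    """Return (issue_ids, review_ids) referenced in one commit message."""
    issue_ids = []
    for match in BUG_RE.finditer(message):
        # Split the comma-separated ID list captured by the pattern.
        issue_ids.extend(int(tok) for tok in re.split(r"\s*,\s*", match.group(1)))
    review_ids = [int(m.group(1)) for m in REVIEW_RE.finditer(message)]
    return issue_ids, review_ids


example = (
    "Fix crash in tab strip.\n\n"
    "BUG=1234, 5678\n"
    "Review URL: https://codereview.example.org/91011\n"
)
issues, reviews = link_commit(example)
print(issues, reviews)
```

Each linked triple (issue, review, commit) could then contribute rows with a case ID, timestamp, activity name, and performer to an event log of the kind the abstract describes.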
An empirical study of dormant bugs
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597108
T. Chen, M. Nagappan, Emad Shihab, A. Hassan
{"title":"An empirical study of dormant bugs","authors":"T. Chen, M. Nagappan, Emad Shihab, A. Hassan","doi":"10.1145/2597073.2597108","DOIUrl":"https://doi.org/10.1145/2597073.2597108","url":null,"abstract":"Over the past decade, several research efforts have studied the quality of software systems by looking at post-release bugs. However, these studies do not account for bugs that remain dormant (i.e., introduced in a version of the software system, but are not found until much later) for years and across many versions. Such dormant bugs skew our understanding of the software quality. In this paper we study dormant bugs against non-dormant bugs using data from 20 different open-source Apache foundation software systems. We find that 33% of the bugs introduced in a version are not reported till much later (i.e., they are reported in future versions as dormant bugs). Moreover, we find that 18.9% of the reported bugs in a version are not even introduced in that version (i.e., they are dormant bugs from prior versions). In short, the use of reported bugs to judge the quality of a specific version might be misleading. Exploring the fix process for dormant bugs, we find that they are fixed faster (median fix time of 5 days) than non-dormant bugs (median fix time of 8 days), and are fixed by more experienced developers (median commit counts of developers who fix dormant bug is 169% higher). Our results highlight that dormant bugs are different from non-dormant bugs in many perspectives and that future research in software quality should carefully study and consider dormant bugs.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"38 1","pages":"82-91"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88376507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 68
Classifying unstructured data into natural language text and technical information
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597112
T. Merten, Bastian Mager, Simone Bürsner, B. Paech
{"title":"Classifying unstructured data into natural language text and technical information","authors":"T. Merten, Bastian Mager, Simone Bürsner, B. Paech","doi":"10.1145/2597073.2597112","DOIUrl":"https://doi.org/10.1145/2597073.2597112","url":null,"abstract":"Software repository data, for example in issue tracking systems, include natural language text and technical information, which includes anything from log files via code snippets to stack traces. However, data mining is often only interested in one of the two types e.g. in natural language text when looking at text mining. Regardless of which type is being investigated, any techniques used have to deal with noise caused by fragments of the other type i.e. methods interested in natural language have to deal with technical fragments and vice versa. This paper proposes an approach to classify unstructured data, e.g. development documents, into natural language text and technical information using a mixture of text heuristics and agglomerative hierarchical clustering. The approach was evaluated using 225 manually annotated text passages from developer emails and issue tracker data. Using white space tokenization as a basis, the overall precision of the approach is 0.84 and the recall is 0.85.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"41 1","pages":"300-303"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75569328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Revisiting Android reuse studies in the context of code obfuscation and library usages
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597109
M. Vásquez, Andrew Holtzhauer, Carlos Bernal-Cárdenas, D. Poshyvanyk
{"title":"Revisiting Android reuse studies in the context of code obfuscation and library usages","authors":"M. Vásquez, Andrew Holtzhauer, Carlos Bernal-Cárdenas, D. Poshyvanyk","doi":"10.1145/2597073.2597109","DOIUrl":"https://doi.org/10.1145/2597073.2597109","url":null,"abstract":"In the recent years, studies of design and programming practices in mobile development are gaining more attention from researchers. Several such empirical studies used Android applications (paid, free, and open source) to analyze factors such as size, quality, dependencies, reuse, and cloning. Most of the studies use executable files of the apps (APK files), instead of source code because of availability issues (most of free apps available at the Android official market are not open-source, but still can be downloaded and analyzed in APK format). However, using only APK files in empirical studies comes with some threats to the validity of the results. In this paper, we analyze some of these pertinent threats. In particular, we analyzed the impact of third-party libraries and code obfuscation practices on estimating the amount of reuse by class cloning in Android apps. When including and excluding third-party libraries from the analysis, we found statistically significant differences in the amount of class cloning across 24,379 free Android apps. Also, we found some evidence that obfuscation is responsible for increasing a number of false positives when detecting class clones. Finally, based on our findings, we provide a list of actionable guidelines for mining and analyzing large repositories of Android applications and minimizing these threats to validity.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"265 1","pages":"242-251"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72766230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 78
Improving the effectiveness of test suite through mining historical data
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597084
Jeff Anderson, Saeed Salem, Hyunsook Do
{"title":"Improving the effectiveness of test suite through mining historical data","authors":"Jeff Anderson, Saeed Salem, Hyunsook Do","doi":"10.1145/2597073.2597084","DOIUrl":"https://doi.org/10.1145/2597073.2597084","url":null,"abstract":"Software regression testing is an integral part of most major software projects. As projects grow larger and the number of tests increases, performing regression testing becomes more costly. If software engineers can identify and run tests that are more likely to detect failures during regression testing, they may be able to better manage their regression testing activities. In this paper, to help identify such test cases, we developed techniques that utilize various types of information in software repositories. To assess our techniques, we conducted an empirical study using an industrial software product, Microsoft Dynamics AX, which contains real faults. Our results show that the proposed techniques can be effective in identifying test cases that are likely to detect failures.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"93 1","pages":"142-151"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84198609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
Understanding software evolution: the maisqual ant data set
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597136
B. Baldassari, P. Preux
{"title":"Understanding software evolution: the maisqual ant data set","authors":"B. Baldassari, P. Preux","doi":"10.1145/2597073.2597136","DOIUrl":"https://doi.org/10.1145/2597073.2597136","url":null,"abstract":"Software engineering is a maturing discipline which has seen many drastic advances in the last years. However, some studies still point to the lack of rigorous and mathematically grounded methods to raise the field to a new emerging science, with proper and reproducible foundations to build upon. Indeed, mathematicians and statisticians do not necessarily have software engineering knowledge, while software engineers and practitioners do not necessarily have a mathematical background. The Maisqual research project intends to fill the gap between both fields by proposing a controlled and peer-reviewed data set series ready to use and study. These data sets feature metrics from different repositories, from source code to mail activity and configuration management meta data. Metrics are described and commented, and all the steps followed for their extraction and treatment are described with contextual information about the data and its meaning. This article introduces the Apache Ant weekly data set, featuring 636 extracts of the project over 12 years at different levels of artefacts – application, files, functions. By associating community and process related information to code extracts, this data set unveils interesting perspectives on the evolution of one of the great success stories of open source.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"41 1","pages":"424-427"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78821177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Towards building a universal defect prediction model
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597078
Feng Zhang, A. Mockus, I. Keivanloo, Ying Zou
{"title":"Towards building a universal defect prediction model","authors":"Feng Zhang, A. Mockus, I. Keivanloo, Ying Zou","doi":"10.1145/2597073.2597078","DOIUrl":"https://doi.org/10.1145/2597073.2597078","url":null,"abstract":"To predict files with defects, a suitable prediction model must be built for a software project from either itself (within-project) or other projects (cross-project). A universal defect prediction model that is built from the entire set of diverse projects would relieve the need for building models for an individual project. A universal model could also be interpreted as a basic relationship between software metrics and defects. However, the variations in the distribution of predictors pose a formidable obstacle to build a universal model. Such variations exist among projects with different context factors (e.g., size and programming language). To overcome this challenge, we propose context-aware rank transformations for predictors. We cluster projects based on the similarity of the distribution of 26 predictors, and derive the rank transformations using quantiles of predictors for a cluster. We then fit the universal model on the transformed data of 1,398 open source projects hosted on SourceForge and GoogleCode. Adding context factors to the universal model improves the predictive power. The universal model obtains prediction performance comparable to the within-project models and yields similar results when applied on five external projects (one Apache and four Eclipse projects). These results suggest that a universal defect prediction model may be an achievable goal.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"42 1","pages":"182-191"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89724424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 146
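The quantile-based rank transformation described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the number of rank levels, so ten levels (deciles) is an assumption, and the function names `fit_cut_points` and `rank_transform` are hypothetical. The clustering of projects by context factors is omitted; the sketch only shows how one predictor is mapped to a rank within one cluster.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_cut_points(values, levels=10):
    """Compute the levels-1 quantile cut points of one predictor in one cluster.

    `levels=10` (deciles) is an assumption made for this sketch.
    """
    return quantiles(values, n=levels, method="inclusive")


def rank_transform(value, cut_points):
    """Map a raw metric value to a rank in 1..len(cut_points)+1."""
    # bisect_right counts how many cut points are <= value.
    return bisect_right(cut_points, value) + 1


# Hypothetical predictor values (e.g. lines of code per file) for one cluster.
cluster_values = list(range(1, 101))
cuts = fit_cut_points(cluster_values)
print([rank_transform(v, cuts) for v in (1, 50, 100, 1000)])
```

Because every predictor ends up on the same small rank scale regardless of its raw distribution, data from projects with very different sizes or languages can be pooled, which is the obstacle the abstract says the transformation is meant to address.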
An empirical study of just-in-time defect prediction using cross-project models
2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI: 10.1145/2597073.2597075
Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita, Naoyasu Ubayashi
{"title":"An empirical study of just-in-time defect prediction using cross-project models","authors":"Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita, Naoyasu Ubayashi","doi":"10.1145/2597073.2597075","DOIUrl":"https://doi.org/10.1145/2597073.2597075","url":null,"abstract":"Prior research suggests that predicting defect-inducing changes, i.e., Just-In-Time (JIT) defect prediction is a more practical alternative to traditional defect prediction techniques, providing immediate feedback while design decisions are still fresh in the minds of developers. Unfortunately, similar to traditional defect prediction models, JIT models require a large amount of training data, which is not available when projects are in initial development phases. To address this flaw in traditional defect prediction, prior work has proposed cross-project models, i.e., models learned from older projects with sufficient history. However, cross-project models have not yet been explored in the context of JIT prediction. Therefore, in this study, we empirically evaluate the performance of JIT cross-project models. Through a case study on 11 open source projects, we find that in a JIT cross-project context: (1) high performance within-project models rarely perform well; (2) models trained on projects that have similar correlations between predictor and dependent variables often perform well; and (3) ensemble learning techniques that leverage historical data from several other projects (e.g., voting experts) often perform well. Our findings empirically confirm that JIT cross-project models learned using other projects are a viable solution for projects with little historical data. However, JIT cross-project models perform best when the data used to learn them is carefully selected.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"2068 1","pages":"172-181"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86549900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 163
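The "voting experts" idea named in the abstract above can be illustrated with a minimal majority-vote sketch. Everything concrete here is an assumption: the abstract does not describe the learners or features used, so the three stand-in models below are hypothetical threshold rules, each imagined as having been trained on a different source project, and `majority_vote` is an illustrative name.

```python
def majority_vote(models, change_features):
    """Flag a change as defect-inducing when more than half of the models agree."""
    votes = sum(1 for model in models if model(change_features))
    return votes * 2 > len(models)


# Hypothetical per-project "experts"; real ones would be trained classifiers.
experts = [
    lambda f: f["lines_added"] > 100,
    lambda f: f["files_touched"] > 5,
    lambda f: f["lines_added"] > 500,
]

risky_change = {"lines_added": 200, "files_touched": 8}
small_change = {"lines_added": 50, "files_touched": 8}
print(majority_vote(experts, risky_change), majority_vote(experts, small_change))
```

A target project with little history contributes no model of its own in this setup; it only supplies the change to be scored, which matches the cross-project setting the abstract studies.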