Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering最新文献

筛选
英文 中文
An Evaluation of Parameter Pruning Approaches for Software Estimation 软件估计中参数修剪方法的评价
Thu D. Tran, Vu Nguyen, Thong Truong, C. Tran, Phu Le
{"title":"An Evaluation of Parameter Pruning Approaches for Software Estimation","authors":"Thu D. Tran, Vu Nguyen, Thong Truong, C. Tran, Phu Le","doi":"10.1145/3345629.3345633","DOIUrl":"https://doi.org/10.1145/3345629.3345633","url":null,"abstract":"Model-based estimation often uses impact factors and historical data to predict the effort of new projects. Estimation accuracy of this approach is highly dependent on how well impact factors are selected. This paper comparatively assesses six methods for prune parameters of effort estimation models, including Stepwise regression, Lasso, constrained regression, GRASP, Tabu search, and PCA. Four data sets were used for evaluation, showing that estimation accuracy varies among the methods but no method consistently outperforms the rest. Stepwise regression prunes estimation model parameters the most while it does not sacrifice much estimation performance. Our study provides further evidence to support the use of Stepwise regression for selecting factors in effort estimation.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116415284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Applying Cross Project Defect Prediction Approaches to Cross-Company Effort Estimation 跨项目缺陷预测方法在跨公司工作量估算中的应用
S. Amasaki, Tomoyuki Yokogawa, Hirohisa Aman
{"title":"Applying Cross Project Defect Prediction Approaches to Cross-Company Effort Estimation","authors":"S. Amasaki, Tomoyuki Yokogawa, Hirohisa Aman","doi":"10.1145/3345629.3345638","DOIUrl":"https://doi.org/10.1145/3345629.3345638","url":null,"abstract":"BACKGROUND: Prediction systems in software engineering often suffer from the shortage of suitable data within a project. A promising solution is transfer learning that utilizes data from outside the project. Many transfer learning approaches have been proposed for defect prediction known as cross-project defect prediction (CPDP). In contrast, a few approaches have been proposed for software effort estimation known as cross-company software effort estimation (CCSEE). Both CCSEE and CPDP are engaged in a similar problem, and a few CPDP approaches are applicable as CCSEE in actual. It is thus beneficial for improving CCSEE performance to examine how well CPDP approaches can perform as CCSEE approaches. AIMS: To explore how well CPDP approaches work as CCSEE approaches. METHOD: An empirical experiment was conducted for evaluating the performance of CPDP approaches in CCSEE. We examined 7 CPDP approaches which were selected due to the easiness of application. Those approaches were applied to 8 data sets, each of which consists of a few subsets from different domains. The estimation results were evaluated with a common performance measure called SA. RESULTS: there were several CPDP approaches which could improve the estimation accuracy though the degree of improvement was not large. CONCLUSIONS: A straight forward application of selected CPDP approaches did not bring a clear effect. CCSEE may need specific transfer learning approaches for more improvement.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130720610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Leveraging Change Intents for Characterizing and Identifying Large-Review-Effort Changes 利用变更意图来描述和识别大型评审工作变更
Song Wang, Chetan Bansal, Nachiappan Nagappan, Adithya Abraham Philip
{"title":"Leveraging Change Intents for Characterizing and Identifying Large-Review-Effort Changes","authors":"Song Wang, Chetan Bansal, Nachiappan Nagappan, Adithya Abraham Philip","doi":"10.1145/3345629.3345635","DOIUrl":"https://doi.org/10.1145/3345629.3345635","url":null,"abstract":"Code changes to software occur due to various reasons such as bug fixing, new feature addition, and code refactoring. In most existing studies, the intent of the change is rarely leveraged to provide more specific, context aware analysis. In this paper, we present the first study to leverage change intent to characterize and identify Large-Review-Effort (LRE) changes regarding review effort---changes with large review effort. Specifically, we first propose a feedback-driven and heuristics-based approach to obtain change intents. We then characterize the changes regarding review effort by using various features extracted from change metadata and the change intents. We further explore the feasibility of automatically classifying LRE changes. We conduct our study on a large-scale project from Microsoft and three large-scale open source projects, i.e., Qt, Android, and OpenStack. Our results show that, (i) code changes with some intents are more likely to be LRE changes, (ii) machine learning based prediction models can efficiently help identify LRE changes, and (iii) prediction models built for code changes with some intents achieve better performance than prediction models without considering the change intent, the improvement in AUC can be up to 19 percentage points and is 7.4 percentage points on average. The tool developed in this study has already been used in Microsoft to provide the review effort and intent information of changes for reviewers to accelerate the review process.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"155 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132827447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects 从报告到bug修复提交:来自55个Apache开源项目的10年bug修复活动数据集
Renan Vieira, Antônio da Silva, L. Rocha, J. Gomes
{"title":"From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects","authors":"Renan Vieira, Antônio da Silva, L. Rocha, J. Gomes","doi":"10.1145/3345629.3345639","DOIUrl":"https://doi.org/10.1145/3345629.3345639","url":null,"abstract":"Bugs appear in almost any software development. Solving all or at least a large part of them requires a great deal of time, effort, and budget. Software projects typically use issue tracking systems as a way to report and monitor bug-fixing tasks. In recent years, several researchers have been conducting bug tracking analysis to better understand the problem and thus provide means to reduce costs and improve the efficiency of the bug-fixing task. In this paper, we introduce a new dataset composed of more than 70,000 bug-fix reports from 10 years of bug-fixing activity of 55 projects from the Apache Software Foundation, distributed in 9 categories. We have mined this information from Jira issue track system concerning two different perspectives of reports with closed/resolved status: static (the latest version of reports) and dynamic (the changes that have occurred in reports over time). We also extract information from the commits (if they exist) that fix such bugs from their respective version-control system (Git). We also provide a change analysis that occurs in the reports as a way of illustrating and characterizing the proposed dataset. Once the data extraction process is an error-prone nontrivial task, we believe such initiatives like this could be useful to support researchers in further more detailed investigations.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130126025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
Does chronology matter in JIT defect prediction?: A Partial Replication Study 时间顺序在JIT缺陷预测中重要吗?:部分重复研究
H. Jahanshahi, Dhanya Jothimani, Ayse Basar, Mucahit Cevik
{"title":"Does chronology matter in JIT defect prediction?: A Partial Replication Study","authors":"H. Jahanshahi, Dhanya Jothimani, Ayse Basar, Mucahit Cevik","doi":"10.1145/3345629.3351449","DOIUrl":"https://doi.org/10.1145/3345629.3351449","url":null,"abstract":"BACKGROUND: Just-In-Time (JIT) models, unlike the traditional defect prediction models, detect the fix-inducing changes (or defect inducing changes). These models are designed based on the assumption that past code change properties are similar to future ones. However, as the system evolves, the expertise of developers and/or the complexity of the system also change. AIM: In this work, we aim to investigate the effect of code change properties on JIT models over time. We also study the impact of using recent data as well as all available data on the performance of JIT models. Further, we analyze the effect of weighted sampling on the performance of fix-inducing properties of JIT models. For this purpose, we used datasets from four open-source projects, namely Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL. METHOD: We used five families of change code properties such as size, diffusion, history, experience, and purpose. We used Random Forest to train and test the JIT model and Brier Score (BS) and Area Under Curve (AUC) for performance measurement. We applied the Wilcoxon Signed Rank Test on the output to statistically validate whether the performance of JIT models improves using all the available data or the recent data. RESULTS: Our paper suggest that the predictive power of JIT models does not change by time. Furthermore, we observed that the chronology of data in JIT defect prediction models can be discarded by considering all the available data. On the other hand, the importance score of families of code change properties is found to oscillate over time. CONCLUSION: To mitigate the impact of the evolution of code change properties, it is recommended to use weighted sampling approach in which more emphasis is placed upon the changes occurring closer to the current time. Moreover, since properties such as \"Expertise of the Developer\" and \"Size\" evolve with the time, the models obtained from old data may exhibit different characteristics compared to those employing the newer dataset. Hence, practitioners should constantly retrain JIT models to include fresh data.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127351094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering 第十五届软件工程预测模型与数据分析国际会议论文集
{"title":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","authors":"","doi":"10.1145/3345629","DOIUrl":"https://doi.org/10.1145/3345629","url":null,"abstract":"","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115346584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Prioritizing automated user interface tests using reinforcement learning 使用强化学习对自动化用户界面测试进行优先级排序
A. Nguyen, Bach Le, Vu Nguyen
{"title":"Prioritizing automated user interface tests using reinforcement learning","authors":"A. Nguyen, Bach Le, Vu Nguyen","doi":"10.1145/3345629.3345636","DOIUrl":"https://doi.org/10.1145/3345629.3345636","url":null,"abstract":"User interface testing validates the correctness of an application through visual cues and interactive events emitted in real world usages. Performing user interface tests is a time-consuming process, and thus, many studies have focused on prioritizing test cases to help maintain the effectiveness of testing while reducing the need for a full execution. This paper describes a novel prioritization method that combines Reinforcement Learning and interaction coverage testing concepts. While Reinforcement Learning has been found to be suitable for rapid changing projects with abundant historical data, interaction coverage considers in depth the event-based aspects of user interface testing and provides a granular level at which the Reinforcement Learning system can gain more insights into individual test cases. We experiment and assess the proposed method using five data sets, finding that the method outperforms related methods and has the potential to be used in practice.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123189273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Which Refactoring Reduces Bug Rate? 哪种重构能降低Bug率?
Idan Amit, D. Feitelson
{"title":"Which Refactoring Reduces Bug Rate?","authors":"Idan Amit, D. Feitelson","doi":"10.1145/3345629.3345631","DOIUrl":"https://doi.org/10.1145/3345629.3345631","url":null,"abstract":"We present a methodology to identify refactoring operations that reduce the bug rate in the code. The methodology is based on comparing the bug fixing rate in certain time windows before and after the refactoring. We analyzed 61,331 refactor commits from 1,531 large active GitHub projects. When comparing three-month windows, the bug rate is substantially reduced in 17% of the files of analyzed refactors, compared to 12% of the files in random commits. Within this group, implementing 'todo's provides the most benefits. Certain operations like reuse, upgrade, and using enum and namespaces are also especially beneficial.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"177 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116607453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Reviewer Recommendation using Software Artifact Traceability Graphs 使用软件工件可追溯性图的审稿人建议
Emre Sülün, Eray Tüzün, Ugur Dogrusöz
{"title":"Reviewer Recommendation using Software Artifact Traceability Graphs","authors":"Emre Sülün, Eray Tüzün, Ugur Dogrusöz","doi":"10.1145/3345629.3345637","DOIUrl":"https://doi.org/10.1145/3345629.3345637","url":null,"abstract":"Various types of artifacts (requirements, source code, test cases, documents, etc.) are produced throughout the lifecycle of a software. These artifacts are often related with each other via traceability links that are stored in modern application lifecycle management repositories. Throughout the lifecycle of a software, various types of changes can arise in any one of these artifacts. It is important to review such changes to minimize their potential negative impacts. To maximize benefits of the review process, the reviewer(s) should be chosen appropriately. In this study, we reformulate the reviewer suggestion problem using software artifact traceability graphs. We introduce a novel approach, named RSTrace, to automatically recommend reviewers that are best suited based on their familiarity with a given artifact. The proposed approach, in theory, could be applied to all types of artifacts. For the purpose of this study, we focused on the source code artifact and conducted an experiment on finding the appropriate code reviewer(s). We initially tested RSTrace on an open source project and achieved top-3 recall of 0.85 with an MRR (mean reciprocal ranking) of 0.73. In a further empirical evaluation of 37 open source projects, we confirmed that the proposed reviewer recommendation approach yields promising top-k and MRR scores on the average compared to the existing reviewer recommendation approaches.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120995422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
The Technical Debt Dataset 技术债务数据集
Valentina Lenarduzzi, Nyyti Saarimäki, D. Taibi
{"title":"The Technical Debt Dataset","authors":"Valentina Lenarduzzi, Nyyti Saarimäki, D. Taibi","doi":"10.1145/3345629.3345630","DOIUrl":"https://doi.org/10.1145/3345629.3345630","url":null,"abstract":"Technical Debt analysis is increasing in popularity as nowadays researchers and industry are adopting various tools for static code analysis to evaluate the quality of their code. Despite this, empirical studies on software projects are expensive because of the time needed to analyze the projects. In addition, the results are difficult to compare as studies commonly consider different projects. In this work, we propose the Technical Debt Dataset, a curated set of project measurement data from 33 Java projects from the Apache Software Foundation. In the Technical Debt Dataset, we analyzed all commits from separately defined time frames with SonarQube to collect Technical Debt information and with Ptidej to detect code smells. Moreover, we extracted all available commit information from the git logs, the refactoring applied with Refactoring Miner, and fault information reported in the issue trackers (Jira). Using this information, we executed the SZZ algorithm to identify the fault-inducing and -fixing commits. We analyzed 78K commits from the selected 33 projects, detecting 1.8M SonarQube issues, 62K code smells, 28K faults and 57K refactorings. The project analysis took more than 200 days. In this paper, we describe the data retrieval pipeline together with the tools used for the analysis. The dataset is made available through CSV files and an SQLite database to facilitate queries on the data. The Technical Debt Dataset aims to open up diverse opportunities for Technical Debt research, enabling researchers to compare results on common projects.","PeriodicalId":424201,"journal":{"name":"Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116366624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 63
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信