2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)最新文献

筛选
英文 中文
The role of meta-learners in the adaptive selection of classifiers 元学习者在分类器自适应选择中的作用
D. D. Nucci, A. D. Lucia
{"title":"The role of meta-learners in the adaptive selection of classifiers","authors":"D. D. Nucci, A. D. Lucia","doi":"10.1109/MALTESQUE.2018.8368452","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368452","url":null,"abstract":"The use of machine learning techniques able to classify source code components in defective or not received a lot of attention by the research community in the last decades. Previous studies indicated that no machine learning classifier is capable of providing the best accuracy in any context, highlighting interesting complementarity among them. For these reasons ensemble methods, that combines several classifier models, have been proposed. Among these, it was proposed ASCI (Adaptive Selection of Classifiers in bug predIction), an adaptive method able to dynamically select among a set of machine learning classifiers the one that better predicts the bug proneness of a class based on its characteristics. In summary, ASCI experiments each classifier on the training set and then use a meta-learner (e.g., Random Forest) to select the most suitable classifier to use for each test set instance. In this work, we conduct an empirical investigation on 21 open source software systems with the aim of analyzing the performance of several classifiers used as meta-learner in combination with ASCI. The results show that the selection of the meta-learner has not strong influence in the results achieved by ASCI in the context of within-project bug prediction. Indeed, the use of lightweight classifiers such as Naive Bayes or Logistic Regression is suggested.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125318310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
User-perceived reusability estimation based on analysis of software repositories 基于软件存储库分析的用户感知的可重用性评估
Michail D. Papamichail, Themistoklis G. Diamantopoulos, Ilias Chrysovergis, Philippos Samlidis, A. Symeonidis
{"title":"User-perceived reusability estimation based on analysis of software repositories","authors":"Michail D. Papamichail, Themistoklis G. Diamantopoulos, Ilias Chrysovergis, Philippos Samlidis, A. Symeonidis","doi":"10.1109/MALTESQUE.2018.8368459","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368459","url":null,"abstract":"The popularity of open-source software repositories has led to a new reuse paradigm, where online resources can be thoroughly analyzed to identify reusable software components. Obviously, assessing the quality and specifically the reusability potential of source code residing in open software repositories poses a major challenge for the research community. Although several systems have been designed towards this direction, most of them do not focus on reusability. In this paper, we define and formulate a reusability score by employing information from GitHub stars and forks, which indicate the extent to which software components are adopted/accepted by developers. Our methodology involves applying and assessing different state-of-the-practice machine learning algorithms, in order to construct models for reusability estimation at both class and package levels. Preliminary evaluation of our methodology indicates that our approach can successfully assess reusability, as perceived by developers.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123020593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Ensemble techniques for software change prediction: A preliminary investigation 软件变更预测的集成技术:初步研究
Gemma Catolino, F. Ferrucci
{"title":"Ensemble techniques for software change prediction: A preliminary investigation","authors":"Gemma Catolino, F. Ferrucci","doi":"10.1109/MALTESQUE.2018.8368455","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368455","url":null,"abstract":"Predicting the classes more likely to change in the future helps developers to focus on the more critical parts of a software system, with the aim of preventively improving its maintainability. The research community has devoted a lot of effort in the definition of change prediction models, i.e., models exploiting a machine learning classifier to relate a set of independent variables to the change-proneness of classes. Besides the good performances of such models, key results of previous studies highlight how classifiers tend to perform similarly even though they are able to correctly predict the change-proneness of different code elements, possibly indicating the presence of some complementarity among them. In this paper, we aim at analyzing the extent to which ensemble methodologies, i.e., machine learning techniques able to combine multiple classifiers, can improve the performances of change-prediction models. Specifically, we empirically compared the performances of three ensemble techniques (i.e., Boosting, Random Forest, and Bagging) with those of standard machine learning classifiers (i.e., Logistic Regression and Naive Bayes). The study was conducted on eight open source systems and the results showed how ensemble techniques, in some cases, perform better than standard machine learning approaches, even if the differences among them is small. This requires the need of further research aimed at devising effective methodologies to ensemble different classifiers.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121962164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
Varying defect prediction approaches during project evolution: A preliminary investigation 在项目发展过程中变化缺陷预测方法:初步调查
Salvatore Geremia, D. Tamburri
{"title":"Varying defect prediction approaches during project evolution: A preliminary investigation","authors":"Salvatore Geremia, D. Tamburri","doi":"10.1109/MALTESQUE.2018.8368451","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368451","url":null,"abstract":"Defect prediction approaches use various features of software product or process to prioritize testing, analysis and general quality assurance activities. Such approaches require the availability of project's historical data, making them inapplicable in early phase. To cope with this problem, researchers have proposed cross-project and even cross-company prediction models, which use training material from other projects to build the model. Despite such advances, there is limited knowledge of how, as the project evolves, it would be convenient to still keep using data from other projects, and when, instead, it might become convenient to switch towards a local prediction model. This paper empirically investigates, using historical data from four open source projects, on how the performance of various kinds of defect prediction approaches — within-project prediction, local and global cross-project prediction, and mixed (injected local cross) prediction — varies over time. Results of the study are part of a long-term investigation towards supporting the customization of defect prediction models over projects' history.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127023587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
ConfigFile++: Automatic comment enhancement for misconfiguration prevention configfile++:自动注释增强,防止错误配置
Yuanliang Zhang, Shanshan Li, Xiangyang Xu, Xiangke Liao, Shazhou Yang, Yun Xiong
{"title":"ConfigFile++: Automatic comment enhancement for misconfiguration prevention","authors":"Yuanliang Zhang, Shanshan Li, Xiangyang Xu, Xiangke Liao, Shazhou Yang, Yun Xiong","doi":"10.1109/MALTESQUE.2018.8368457","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368457","url":null,"abstract":"Nowadays, misconfiguration has become one of the key factors leading to system problems. Most current research on the topic explores misconfiguration diagnosis, but is less concerned with educating users about how to configure correctly in order to prevent misconfiguration before it happens. In this paper, we manually study 22 open source software projects and summarize several observations on the comments of their configuration files, most of which lack sufficient information and are poorly formatted. Based on these observations and the general process of misconfiguration diagnosis, we design and implement a tool called ConfigFile++ that automatically enhances the comment in configuration files. By using name-based analysis and machine learning, ConfigFile++ extracts guiding information about the configuration option from the user manual and source code, and inserts it into the configuration files. The format of insert comment is also designed to make enhanced comments concise and clear. We use real-world examples of misconfigurations to evaluate our tool. The results show that ConfigFile++ can prevent 33 out of 50 misconfigurations.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125325699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Co-evolution analysis of production and test code by learning association rules of changes 通过学习变更的关联规则对生产代码和测试代码进行协同演化分析
László Vidács, M. Pinzger
{"title":"Co-evolution analysis of production and test code by learning association rules of changes","authors":"László Vidács, M. Pinzger","doi":"10.1109/MALTESQUE.2018.8368456","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368456","url":null,"abstract":"Many modern software systems come with automated tests. While these tests help to maintain code quality by providing early feedback after modifications, they also need to be maintained. In this paper, we replicate a recent pattern mining experiment to find patterns on how production and test code co-evolve over time. Understanding co-evolution patterns may directly affect the quality of tests and thus the quality of the whole system. The analysis takes into account fine grained changes in both types of code. Since the full list of fine grained changes cannot be perceived, association rules are learned from the history to extract co-change patterns. We analyzed the occurrence of 6 patterns throughout almost 2500 versions of a Java system and found that patterns are present, but supported by weaker links than in previously reported. Hence we experimented with weighting methods and investigated the composition of commits.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133551414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
Machine learning-based run-time anomaly detection in software systems: An industrial evaluation 基于机器学习的软件系统运行时异常检测:工业评估
Fabian Huch, Mojdeh Golagha, A. Petrovska, Alexander Krauss
{"title":"Machine learning-based run-time anomaly detection in software systems: An industrial evaluation","authors":"Fabian Huch, Mojdeh Golagha, A. Petrovska, Alexander Krauss","doi":"10.1109/MALTESQUE.2018.8368453","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368453","url":null,"abstract":"Anomalies are an inevitable occurrence while operating enterprise software systems. Traditionally, anomalies are detected by threshold-based alarms for critical metrics, or health probing requests. However, fully automated detection in complex systems is challenging, since it is very difficult to distinguish truly anomalous behavior from normal operation. To this end, the traditional approaches may not be sufficient. Thus, we propose machine learning classifiers to predict the system's health status. We evaluated our approach in an industrial case study, on a large, real-world dataset of 7.5 • 106 data points for 231 features. Our results show that recurrent neural networks with long short-term memory (LSTM) are more effective in detecting anomalies and health issues, as compared to other classifiers. We achieved an area under precision-recall curve of 0.44. At the default threshold, we can automatically detect 70% of the anomalies. Despite the low precision of 31 %, the rate in which false positives occur is only 4 %.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127995889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
How high will it be? Using machine learning models to predict branch coverage in automated testing 它会有多高?使用机器学习模型来预测自动化测试中的分支覆盖率
Giovanni Grano, Timofey V. Titov, Sebastiano Panichella, H. Gall
{"title":"How high will it be? Using machine learning models to predict branch coverage in automated testing","authors":"Giovanni Grano, Timofey V. Titov, Sebastiano Panichella, H. Gall","doi":"10.1109/MALTESQUE.2018.8368454","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368454","url":null,"abstract":"Software testing is a crucial component in modern continuous integration development environment. Ideally, at every commit, all the system's test cases should be executed and moreover, new test cases should be generated for the new code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. Furthermore, developers want to achieve a minimum level of coverage for every build of their systems. Since both executing all the test cases and generating new ones for all the classes at every commit is not feasible, they have to select which subset of classes has to be tested. In this context, knowing a priori the branch coverage that can be achieved with test data generation tools might give some useful indications for answering such a question. In this paper, we take the first steps towards the definition of machine learning models to predict the branch coverage achieved by test data generation tools. We conduct a preliminary study considering well known code metrics as a features. Despite the simplicity of these features, our results show that using machine learning to predict branch coverage in automated testing is a viable and feasible option.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130424585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 25
Investigating type declaration mismatches in Python 调查Python中的类型声明不匹配
L. Pascarella, Achyudh Ram, A. Nadeem, Dinesh Bisesser, Norman Knyazev, Alberto Bacchelli
{"title":"Investigating type declaration mismatches in Python","authors":"L. Pascarella, Achyudh Ram, A. Nadeem, Dinesh Bisesser, Norman Knyazev, Alberto Bacchelli","doi":"10.1109/MALTESQUE.2018.8368458","DOIUrl":"https://doi.org/10.1109/MALTESQUE.2018.8368458","url":null,"abstract":"Past research provided evidence that developers making code changes sometimes omit to update the related documentation, thus creating inconsistencies that may contribute to faults and crashes. In dynamically typed languages, such as Python, an inconsistency in the documentation may lead to a mismatch in type declarations only visible at runtime. With our study, we investigate how often the documentation is inconsistent in a sample of 239 methods from five Python open-source software projects. Our results highlight that more than 20% of the comments are either partially defined or entirely missing and that almost 1% of the methods in the analyzed projects contain type inconsistencies. Based on these results, we create a tool, PyID, to early detect type mismatches in Python documentation and we evaluate its performance with our oracle.","PeriodicalId":345739,"journal":{"name":"2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130300574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信