2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE): Latest Publications

Using machine learning to design a flexible LOC counter

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882011

Miroslaw Ochodek, M. Staron, Dominik Bargowski, Wilhelm Meding, R. Hebig

Abstract: The result of counting program size in Lines of Code (LOC) depends on the counting rules, i.e., the definition of which lines should be counted. In most measurement tools these rules are hard-coded, and users do not know which lines were counted and which were not. The goal of our research is to investigate how machine learning can teach a measurement tool which lines should be counted. In particular, we want to identify which parameters of the learning algorithm can be used to classify lines as counted or not. Our research follows the design science research methodology: we construct a measurement tool based on machine learning and evaluate it on open source programs. To build the training set, we ask industry professionals to classify which lines should be counted. The results show that the classification reaches an average accuracy between 0.90 and 0.99 measured as the Matthews correlation coefficient, and between 95% and nearly 100% measured as the percentage of correctly classified lines. Based on these results we conclude that machine learning algorithms have large potential as the core of modern measurement instruments and should be explored further.

Citations: 12
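The abstract above reports results with two measures: the percentage of correctly classified lines and the Matthews correlation coefficient (MCC). As a rough illustration of how the two differ, the sketch below computes both from a confusion matrix. The label lists `human` and `tool` are invented for this example (1 = "count this line", 0 = "skip it") and are not data from the paper.

```python
import math

def confusion(expected, predicted):
    """Return (tp, tn, fp, fn) for two equally long lists of 0/1 labels."""
    tp = sum(1 for e, p in zip(expected, predicted) if e == 1 and p == 1)
    tn = sum(1 for e, p in zip(expected, predicted) if e == 0 and p == 0)
    fp = sum(1 for e, p in zip(expected, predicted) if e == 0 and p == 1)
    fn = sum(1 for e, p in zip(expected, predicted) if e == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of lines classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical labels for ten source lines (illustrative only).
human = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]   # what a professional would count
tool = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]    # what the classifier decided

counts = confusion(human, tool)
acc = accuracy(*counts)
coef = mcc(*counts)
print(f"accuracy={acc:.2f}  mcc={coef:.2f}")
```

A single misclassified line out of ten leaves accuracy at 0.90 but pulls MCC down to about 0.80, because MCC also accounts for how the error is distributed across the two classes.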

Investigating code smell co-occurrences using association rule learning: A replicated study

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882010

Fabio Palomba, R. Oliveto, A. D. Lucia

Abstract: Previous research demonstrated how code smells (i.e., symptoms of poor design or implementation choices) threaten software maintainability. Moreover, some studies showed that interacting smells have a stronger negative impact on developers' ability to comprehend and enhance source code than a single code smell instance affecting a code element (i.e., a class or a method). While such studies analyzed the effect of co-present smells from the developers' perspective, little is currently known about which code smell types tend to co-occur in source code. Indeed, previous studies of smell co-occurrence considered a small number of code smell types or small datasets, thus possibly missing important relationships. To corroborate and possibly extend the knowledge on this phenomenon, in this paper we provide a large-scale replication of previous studies, taking into account 13 code smell types on a dataset of 395 releases of 30 software systems. Code smell co-occurrences are captured using association rule mining, an unsupervised learning technique able to discover frequent relationships in a dataset. The results highlight some expected relationships, but also shed light on co-occurrences missed by previous research in the field.

Citations: 23
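Association rule mining, as mentioned in the abstract above, scores a rule "smell A implies smell B" by its support, confidence, and lift. The following is a minimal sketch restricted to pairwise rules. It is not the authors' tooling; the function `pair_rules`, its thresholds, and the five example classes are assumptions made up for illustration.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Mine rules lhs -> rhs between single items.

    transactions: list of sets, one set of smell names per code element.
    Returns tuples (lhs, rhs, support, confidence, lift).
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for smells in transactions:
        for smell in smells:
            item_count[smell] = item_count.get(smell, 0) + 1
        for a, b in combinations(sorted(smells), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), together in pair_count.items():
        support = together / n                 # P(a and b)
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = together / item_count[lhs]     # P(rhs | lhs)
            lift = confidence / (item_count[rhs] / n)   # > 1: positive association
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence, lift))
    return rules

# Hypothetical smell sets for five classes (illustrative only).
classes = [
    {"God Class", "Long Method"},
    {"God Class", "Long Method", "Feature Envy"},
    {"Long Method"},
    {"God Class", "Long Method"},
    {"Feature Envy"},
]

rules = pair_rules(classes)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

On this toy input only the God Class / Long Method pair clears the support threshold. A full miner such as Apriori would also consider itemsets larger than two.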

Automatic feature selection by regularization to improve bug prediction accuracy

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882013

Haidar Osman, Mohammad Ghafari, Oscar Nierstrasz

Abstract: Bug prediction has been a hot research topic for the past two decades, during which different machine learning models based on a variety of software metrics have been proposed. Feature selection is a technique that removes noisy and redundant features to improve the accuracy and generalizability of a prediction model. Although feature selection is important, it adds yet another step to the process of building a bug prediction model and increases its complexity. Recent advances in machine learning have introduced embedded feature selection methods that allow a prediction model to carry out feature selection automatically as part of the training process. The effect of these methods on bug prediction is unknown. In this paper we study regularization as an embedded feature selection method in bug prediction models. Specifically, we study the impact of three regularization methods (Ridge, Lasso, and ElasticNet) on linear and Poisson regression as bug predictors for five open source Java systems. Our results show that the three regularization methods reduce the prediction error of the regressors and improve their stability.

Citations: 25
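To illustrate why regularization can act as embedded feature selection, the sketch below fits a linear model by coordinate descent with an L1 penalty (Lasso), an L2 penalty (Ridge), or both (ElasticNet). It is a toy under stated assumptions, not the setup used in the paper: the function names, the penalty values, and the four-row dataset are invented, the features are assumed to be centered, and no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_elastic_net(X, y, l1=0.0, l2=0.0, iterations=50):
    """Minimise (1/2n)*||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2.

    l1 > 0, l2 = 0 gives Lasso; l1 = 0, l2 > 0 gives Ridge.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iterations):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            z = 0.0     # mean squared value of feature j
            for i in range(n):
                others = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, l1) / (z + l2)
    return w

# Hypothetical centered metrics: feature 0 drives the response, feature 1 barely does.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]          # equals 2.0 * feature0 + 0.1 * feature1

ols = fit_elastic_net(X, y)
ridge = fit_elastic_net(X, y, l2=0.5)
lasso = fit_elastic_net(X, y, l1=0.5)
enet = fit_elastic_net(X, y, l1=0.5, l2=0.5)
print("ols  ", ols)
print("ridge", ridge)
print("lasso", lasso)
print("enet ", enet)
```

In this toy case Ridge shrinks both coefficients but keeps them non-zero, while the L1 term in Lasso and ElasticNet sets the weak feature's coefficient to exactly zero, which is the sense in which the model "selects" features during training.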

Hyperparameter optimization to improve bug prediction accuracy

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882014

Haidar Osman, Mohammad Ghafari, Oscar Nierstrasz

Abstract: Bug prediction is a technique that strives to identify where defects will appear in a software system. Bug prediction employs machine learning to predict defects in software entities based on software metrics. These machine learning models usually have adjustable parameters, called hyperparameters, that need to be tuned for the prediction problem at hand. However, most studies in the literature keep the model hyperparameters at the default values provided by the machine learning framework in use. In this paper we investigate whether optimizing the hyperparameters of a machine learning model improves its predictive power. We study two machine learning algorithms: k-nearest neighbours (IBK) and support vector machines (SVM). We carry out experiments on five open source Java systems. Our results show that (i) models differ in their sensitivity to their hyperparameters, (ii) tuning hyperparameters gives at least as accurate models for SVM and significantly more accurate models for IBK, and (iii) most of the default values are changed during the tuning phase. Based on these findings we recommend tuning hyperparameters as a necessary step before using a machine learning model in bug prediction.

Citations: 30
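The idea of tuning a hyperparameter instead of keeping its default can be sketched with a small grid search over k for a k-nearest-neighbours classifier, scored by leave-one-out cross-validation. Everything here is an illustrative assumption rather than the study's procedure: the helper names, the candidate grid, and the one-feature dataset (which contains two deliberately mislabelled points) are made up.

```python
def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)

def grid_search(data, candidates):
    """Score every candidate k and return the first best one with all scores."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    best = max(candidates, key=lambda k: scores[k])
    return best, scores

# Hypothetical entities: (metric value,) -> 1 if buggy, 0 otherwise.
# The points at 2.5 and 7.5 carry noisy labels on purpose.
data = [
    ((1.0,), 0), ((2.0,), 0), ((2.5,), 1), ((3.0,), 0), ((3.8,), 0),
    ((6.2,), 1), ((7.0,), 1), ((7.5,), 0), ((8.0,), 1), ((9.0,), 1),
]

best_k, scores = grid_search(data, candidates=(1, 3, 5))
print("scores:", scores)
print("best k:", best_k)
```

With k = 1 each noisy point misleads its neighbours, so the search prefers a larger k on this toy data. Real studies would typically use k-fold cross-validation and a wider grid.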

Machine learning for finding bugs: An initial report

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882012

Timothy Chappell, C. Cifuentes, P. Krishnan, S. Geva

Abstract: Static program analysis is a technique to analyse code without executing it, and can be used to find bugs in source code. Many open source and commercial tools have been developed in this space over the past 20 years. Scalability and precision are important for the deployment of static code analysis tools: numerous false positives and slow runtime both make a tool hard for developers to use, where integration into a nightly build is the standard goal. This requires one to identify a suitable abstraction for the static analysis, which is typically a manual process and can be expensive. In this paper we report our findings on using machine learning techniques to detect defects in C programs. We use three off-the-shelf machine learning techniques and a large corpus of programs available for both training and evaluation. We compare the results produced by the machine learning techniques against the Parfait static program analysis tool used internally at Oracle by thousands of developers. While on the surface the initial results were encouraging, further investigation suggests that the machine learning techniques we used are not suitable replacements for static program analysis tools due to the low precision of the results. This could be due to a variety of reasons, including not using domain knowledge such as the semantics of the programming language, and a lack of suitable data in the training process.

Citations: 20
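The abstract above attributes the negative result to low precision, i.e., too many false positives among the reported defects. A small sketch of how precision and recall can be computed for a bug finder follows. The location strings and both sets are hypothetical and do not come from Parfait or from the paper's corpus.

```python
def precision_recall(reported, actual):
    """Return (precision, recall) for sets of reported and actual defect locations."""
    true_positives = len(reported & actual)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical ground truth: three real defects.
actual = {"a.c:10", "a.c:42", "b.c:7"}

# Hypothetical detector output: finds all three but raises nine false alarms.
reported = actual | {f"c.c:{line}" for line in range(1, 10)}

precision, recall = precision_recall(reported, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A detector like this one looks strong on recall alone, yet three out of four of its warnings would waste a developer's time, which is why precision matters for deployment in a nightly build.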

Using source code metrics to predict change-prone web services: A case-study on ebay services

2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) Pub Date : 2017-02-21 DOI: 10.1109/MALTESQUE.2017.7882009

L. Kumar, S. K. Rath, A. Sureka

Citations: 19