Classification and metaclassification in large scale data mining application for estimation of software projects
D. Dzega, W. Pietruszkiewicz
2010 IEEE 9th International Conference on Cybernetic Intelligent Systems, published 2010-09-01
DOI: 10.1109/UKRICIS.2010.5898136 (https://doi.org/10.1109/UKRICIS.2010.5898136)
Citation count: 5
Abstract
In this article we present an application of Artificial Intelligence to the estimation of software projects. The research is based on several classification and metaclassification methods. Because of the increasing significance of Open Source, we selected projects hosted on Sourceforge.net, the leading platform for Open Source projects. In the first part of the article, we describe the data extraction steps. Extraction was a large-scale task because the data source contained tens of tables and hundreds of fields that were originally gathered for use by a web-based project management system. Extracting meaningful data therefore required analysing the database structure and transforming sets of records into four datasets. These datasets were used to predict four factors important to project management, i.e., skills, time, costs, and effectiveness. We then present the results of experiments performed using the C4.5, RandomTree, and CART algorithms. In the final part of the article, we describe how boosting and bagging metaclassifiers were applied to improve the results, and we analyse the influence of their parameters on generalization ability and prediction accuracy.
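The bagging idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' setup: the abstract names C4.5, RandomTree, and CART as base classifiers, whereas this example swaps in one-split decision stumps to stay short, and it runs on synthetic data, not the Sourceforge.net datasets. All function names (`train_stump`, `bagging_fit`, `bagging_predict`, `accuracy`) and the parameter `n_estimators` are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-split decision stump (a stand-in for a full tree learner).

    Tries every observed value of every feature as a threshold and keeps
    the split with the fewest training errors.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # An empty right side falls back to the left-side label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def bagging_fit(rows, labels, n_estimators, rng):
    """Train n_estimators stumps, each on a bootstrap resample of the data."""
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base classifiers by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]


def accuracy(models, rows, labels):
    hits = sum(bagging_predict(models, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Synthetic two-feature data (an assumption, not data from the paper).
rng = random.Random(0)
data = [[rng.random(), rng.random()] for _ in range(150)]
target = [int(r[0] + r[1] > 1.0) for r in data]
train_x, test_x = data[:100], data[100:]
train_y, test_y = target[:100], target[100:]

# Vary the ensemble size, one parameter of the kind the abstract refers to.
for size in (1, 5, 25):
    ensemble = bagging_fit(train_x, train_y, size, random.Random(1))
    print(size, accuracy(ensemble, test_x, test_y))
```

Comparing held-out accuracy across ensemble sizes is one simple way to examine how a metaclassifier parameter affects generalization; boosting would differ in that it reweights training examples between rounds instead of resampling them independently.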