Mining Open Source Software data using regular expressions
Qifeng Li, Bing Li
2011 IEEE International Conference on Cloud Computing and Intelligence Systems, 2011-10-13
DOI: 10.1109/CCIS.2011.6045129 (https://doi.org/10.1109/CCIS.2011.6045129)
Citations: 3
Abstract
Open Source Software (OSS) management has attracted considerable attention in the last few years. Project management for effective software process improvement must be based on quantitative data. However, because data collection for measurement is costly and requires collaboration with developers, and because data dumps may require a huge effort to understand their schemas and tables, it is difficult to collect coherent, quantitative data continuously and to use that data for software process improvement. In this paper, we report our results of mining data acquired from SourceForge.net, the largest open source software hosting website. In the process, we describe the Mailing List Crawler (MC), which automatically collects mailing list repositories from widely used software development support systems. By presenting integrated measurement results graphically, MC can help developers and managers keep projects under control in real time.
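The abstract does not give the patterns MC uses, so the following is only a minimal sketch of the general idea: applying regular expressions to a mailing list archive to obtain quantitative data. It assumes an mbox-style plain-text archive. The sample text, the patterns, and the function names (`parse_messages`, `posts_per_sender`, `thread_sizes`) are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical mbox-style archive text, made up for illustration only.
SAMPLE_ARCHIVE = """\
From alice@example.org Mon Oct 10 09:15:00 2011
From: Alice <alice@example.org>
Date: Mon, 10 Oct 2011 09:15:00 +0000
Subject: [dev] Build fails on trunk

The build is broken.

From bob@example.org Mon Oct 10 10:02:00 2011
From: Bob <bob@example.org>
Date: Mon, 10 Oct 2011 10:02:00 +0000
Subject: Re: [dev] Build fails on trunk

Fixed in r1234.

From alice@example.org Tue Oct 11 08:30:00 2011
From: Alice <alice@example.org>
Date: Tue, 11 Oct 2011 08:30:00 +0000
Subject: Re: [dev] Build fails on trunk

Thanks, confirmed.
"""

# Assumed patterns: an mbox "From " separator line and three header fields.
MESSAGE_SPLIT = re.compile(r"^From \S+ .*$", re.MULTILINE)
SENDER = re.compile(r"^From:.*?([\w.+-]+@[\w.-]+)", re.MULTILINE)
DATE = re.compile(r"^Date:[ \t]*(.*)$", re.MULTILINE)
SUBJECT = re.compile(r"^Subject:[ \t]*(.*)$", re.MULTILINE)
REPLY_PREFIX = re.compile(r"^(?:re:\s*)+", re.IGNORECASE)


def parse_messages(archive_text):
    """Split an archive into messages and extract sender, date, and subject."""
    records = []
    for chunk in MESSAGE_SPLIT.split(archive_text):
        if not chunk.strip():
            continue  # skip the empty piece before the first separator
        sender = SENDER.search(chunk)
        date = DATE.search(chunk)
        subject = SUBJECT.search(chunk)
        if sender and date and subject:
            records.append({
                "sender": sender.group(1),
                "date": date.group(1).strip(),
                "subject": subject.group(1).strip(),
            })
    return records


def posts_per_sender(records):
    """Count how many messages each address posted."""
    return dict(Counter(r["sender"] for r in records))


def thread_sizes(records):
    """Group messages into threads by subject, ignoring leading 'Re:' prefixes."""
    return dict(Counter(REPLY_PREFIX.sub("", r["subject"]) for r in records))


if __name__ == "__main__":
    messages = parse_messages(SAMPLE_ARCHIVE)
    print(posts_per_sender(messages))
    print(thread_sizes(messages))
```

Counts such as posts per sender or messages per thread are the kind of quantitative measure that could then be charted. A real archive would need more robust handling, for example of body lines that begin with "From " and of encoded headers.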