Open-Source Software Projects Curating Model for Empirical Software Engineering Studies
J. A. Carruthers
Conferencia Iberoamericana de Software Engineering, 2022-06-13
DOI: 10.5753/cibse.2022.20992 (https://doi.org/10.5753/cibse.2022.20992)
Citations: 0
Abstract
Software projects are common inputs in Empirical Software Engineering (ESE) studies, but they are often selected without a specific strategy, which leads to biased samples. To avoid this problem, researchers often use publicly available datasets instead of picking the projects themselves. However, some datasets are not maintained and contain old versions of projects, or even deprecated projects. This may undermine the representativeness of such datasets, because development practices and technologies change substantially over time. The main goal of this research is to develop a model of procedures for constructing and maintaining a dataset of software projects, together with their product quality metrics, to support ESE studies.