Mining software code repositories and bug databases using survival analysis models
M. Wedel, U. Jensen, P. Göhner
International Symposium on Empirical Software Engineering and Measurement, 2008-10-09
DOI: 10.1145/1414004.1414052 (https://doi.org/10.1145/1414004.1414052)
Code repositories and bug databases contain valuable information about the software development process. Typical studies correlate code properties with the number of faults in a software module to find error-prone modules. However, many studies do not consider when faults occur, although this time information can be retrieved from bug databases. To address this, we suggest applying survival analysis models, which are used in biostatistics and can handle time-dependent data. Because a large amount of raw data has to be evaluated statistically, we further discuss the automated retrieval and pre-processing of raw data from code repositories and bug databases.
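
To make the idea of time-to-fault analysis concrete, here is a minimal sketch and not the authors' method: the abstract does not say which survival model is used, so the example assumes the Kaplan-Meier estimator, a standard non-parametric survival technique. The record layout, the function names `to_survival_records` and `kaplan_meier`, and all dates are hypothetical. Modules with no reported fault by the end of the observation window are treated as censored rather than discarded.

```python
from collections import Counter
from datetime import date


def to_survival_records(modules, study_end):
    """Turn (created, first_fault_or_None) date pairs into (duration_days, observed).

    A module without a reported fault is censored at study_end.
    """
    records = []
    for created, first_fault in modules:
        end = first_fault if first_fault is not None else study_end
        records.append(((end - created).days, first_fault is not None))
    return records


def kaplan_meier(records):
    """Return [(t, S(t))] at each distinct time with at least one observed fault."""
    events = Counter(t for t, observed in records if observed)
    exits = Counter(t for t, _ in records)  # faults and censorings both leave the risk set
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        faults = events.get(t, 0)
        if faults:
            survival *= 1.0 - faults / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical pre-processed data: module creation date (from the code repository)
# and date of the first reported fault (from the bug database), or None if no fault.
modules = [
    (date(2008, 1, 1), date(2008, 1, 31)),
    (date(2008, 1, 1), date(2008, 2, 15)),
    (date(2008, 3, 16), None),
    (date(2008, 1, 1), date(2008, 3, 1)),
    (date(2008, 1, 31), None),
    (date(2008, 1, 1), date(2008, 4, 30)),
]
study_end = date(2008, 4, 30)

records = to_survival_records(modules, study_end)
curve = kaplan_meier(records)
for t, s in curve:
    print(f"day {t}: estimated fault-free probability {s:.3f}")
```

Each point on the resulting curve estimates the probability that a module is still fault-free after the given number of days. A plain fault count per module would not capture this timing, nor would it account for modules observed only briefly.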