{"title":"跨公司缺陷预测的最近邻抽样:只是抽象的","authors":"Burak Turhan, A. Bener, T. Menzies","doi":"10.1145/1390817.1390824","DOIUrl":null,"url":null,"abstract":"Several research in defect prediction focus on building models with available local data (i.e. within company predictors). To employ these models, a company should have a data repository, where project metrics and defect information from past projects are stored. However, few companies apply this practice. In a recent work, we have shown that cross company data can be used for building predictors with the cost of increased false alarms. Thus, we argued that the practical application of cross-company predictors is limited to mission critical projects and companies should starve for local data. In this paper, we show that nearest neighbor (NN) sampling of cross-company data removes the increased false alarm rates. We conclude that cross company defect predictors can be practical tools with NN sampling, yet local predictors are still the best and companies should keep starving for local data.","PeriodicalId":193694,"journal":{"name":"DEFECTS '08","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Nearest neighbor sampling for cross company defect predictors: abstract only\",\"authors\":\"Burak Turhan, A. Bener, T. Menzies\",\"doi\":\"10.1145/1390817.1390824\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Several research in defect prediction focus on building models with available local data (i.e. within company predictors). To employ these models, a company should have a data repository, where project metrics and defect information from past projects are stored. However, few companies apply this practice. In a recent work, we have shown that cross company data can be used for building predictors with the cost of increased false alarms. Thus, we argued that the practical application of cross-company predictors is limited to mission critical projects and companies should starve for local data. In this paper, we show that nearest neighbor (NN) sampling of cross-company data removes the increased false alarm rates. We conclude that cross company defect predictors can be practical tools with NN sampling, yet local predictors are still the best and companies should keep starving for local data.\",\"PeriodicalId\":193694,\"journal\":{\"name\":\"DEFECTS '08\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-07-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"DEFECTS '08\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1390817.1390824\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"DEFECTS '08","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1390817.1390824","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Nearest neighbor sampling for cross company defect predictors: abstract only
Several research in defect prediction focus on building models with available local data (i.e. within company predictors). To employ these models, a company should have a data repository, where project metrics and defect information from past projects are stored. However, few companies apply this practice. In a recent work, we have shown that cross company data can be used for building predictors with the cost of increased false alarms. Thus, we argued that the practical application of cross-company predictors is limited to mission critical projects and companies should starve for local data. In this paper, we show that nearest neighbor (NN) sampling of cross-company data removes the increased false alarm rates. We conclude that cross company defect predictors can be practical tools with NN sampling, yet local predictors are still the best and companies should keep starving for local data.