Sebastião Santos, B. Silveira, Vinicius H. S. Durelli, R. Durelli, S. Souza, M. Delamaro
Proceedings of the 6th Brazilian Symposium on Systematic and Automated Software Testing. Published 2021-09-27. DOI: 10.1145/3482909.3482911 (https://doi.org/10.1145/3482909.3482911)
On Using Decision Tree Coverage Criteria for Testing Machine Learning Models
Over the past decade, there has been growing interest in applying machine learning (ML) to a myriad of tasks, and the adoption of ML-based systems has gone mainstream. This widespread adoption poses new challenges for software testers, who must improve the quality and reliability of these ML-based solutions. To cope with the challenges of testing ML-based systems, we propose novel test adequacy criteria based on decision tree models. Unlike the traditional approach to testing ML models, which relies on manual collection and labelling of data, our criteria leverage the internal structure of decision tree models to guide the selection of test inputs. Specifically, we introduce decision tree coverage (DTC) and boundary value analysis (BVA) as approaches to systematically guide the creation of effective test data that exercises key structural elements of a given decision tree model. To evaluate these criteria, we carried out an experiment using 12 datasets. We measured the effectiveness of test inputs as the difference in the model's behavior between the test inputs and the training data. The experimental results indicate that our testing criteria can be used to guide the generation of effective test data.
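The abstract does not give formal definitions of DTC or BVA, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes DTC can be approximated as the fraction of leaves reached by a test set, and BVA as inputs placed just below and just above each split threshold. The names `Leaf`, `Split`, `leaf_coverage`, `boundary_inputs`, and the parameter `eps` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: int        # index of the feature tested at this node
    threshold: float    # go left when x[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def leaves(node: Node) -> List[Leaf]:
    """Return every leaf of the tree, left to right."""
    if isinstance(node, Leaf):
        return [node]
    return leaves(node.left) + leaves(node.right)


def reached_leaf(node: Node, x: List[float]) -> Leaf:
    """Follow the decision path of input x down to a leaf."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node


def leaf_coverage(tree: Node, inputs: List[List[float]]) -> float:
    """Assumed DTC proxy: fraction of leaves reached by at least one input."""
    hit = {id(reached_leaf(tree, x)) for x in inputs}
    return len(hit) / len(leaves(tree))


def boundary_inputs(node: Node, x: List[float], eps: float = 1e-3) -> List[List[float]]:
    """Assumed BVA proxy: at each split, emit one input just below and one
    just above the threshold, reusing them as starting points for the subtrees.
    Assumes eps is smaller than the gap between nested thresholds on the
    same feature, so earlier path conditions stay satisfied."""
    if isinstance(node, Leaf):
        return []
    below, above = list(x), list(x)
    below[node.feature] = node.threshold - eps
    above[node.feature] = node.threshold + eps
    return ([below, above]
            + boundary_inputs(node.left, below, eps)
            + boundary_inputs(node.right, above, eps))


# A hand-built toy tree with three leaves (illustrative data only).
tree = Split(0, 5.0, Split(1, 2.0, Leaf("A"), Leaf("B")), Leaf("C"))

manual_tests = [[1.0, 1.0], [1.0, 3.0]]           # never reaches leaf C
generated = boundary_inputs(tree, [0.0, 0.0])     # two inputs per split

print(leaf_coverage(tree, manual_tests))
print(leaf_coverage(tree, generated))
```

In this toy setup the hand-picked inputs miss one leaf, while the threshold-derived inputs reach all three, which illustrates how the tree's structure can guide input selection. A tree trained with a real library would need its splits extracted into this form first.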