{"title":"Meta-learning for data summarization based on instance selection method","authors":"K. Smith‐Miles, R. Islam","doi":"10.1109/CEC.2010.5585986","DOIUrl":null,"url":null,"abstract":"The purpose of instance selection is to identify which instances (examples, patterns) in a large dataset should be selected as representatives of the entire dataset, without significant loss of information. When a machine learning method is applied to the reduced dataset, the accuracy of the model should not be significantly worse than if the same method were applied to the entire dataset. The reducibility of any dataset, and hence the success of instance selection methods, surely depends on the characteristics of the dataset, as well as the machine learning method. This paper adopts a meta-learning approach, via an empirical study of 112 classification datasets from the UCI Repository [1], to explore the relationship between data characteristics, machine learning methods, and the success of instance selection method.","PeriodicalId":6344,"journal":{"name":"2009 IEEE Congress on Evolutionary Computation","volume":"5 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2010-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Congress on Evolutionary Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CEC.2010.5585986","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11
Abstract
The purpose of instance selection is to identify which instances (examples, patterns) in a large dataset should be selected as representatives of the entire dataset, without significant loss of information. When a machine learning method is applied to the reduced dataset, the accuracy of the model should not be significantly worse than if the same method were applied to the entire dataset. The reducibility of any dataset, and hence the success of instance selection methods, surely depends on the characteristics of the dataset, as well as the machine learning method. This paper adopts a meta-learning approach, via an empirical study of 112 classification datasets from the UCI Repository [1], to explore the relationship between data characteristics, machine learning methods, and the success of instance selection method.