Title: Ensemble learning with surrogate splits
Authors: M. Amasyali
Venue: 2017 25th Signal Processing and Communications Applications Conference (SIU)
Published: 2017-05-15
DOI: 10.1109/SIU.2017.7960149 (https://doi.org/10.1109/SIU.2017.7960149)
Citations: 1
Abstract
Surrogate splits are typically used to classify test samples that have missing values. In this work, they are instead used to obtain different decisions from the same decision tree. Popular ensemble algorithms produce different decisions by using different sub-samples and sub-spaces. In our approach, by contrast, different versions of a test sample are generated by randomly deleting some of its features, and surrogate splits allow the tree to produce a different decision for each version. The original and surrogate-split versions of the ensemble algorithms are compared on 41 UCI datasets, and the surrogate-split versions generally perform better than the originals. The proposed method can be used within any ensemble algorithm that uses decision trees as its base learner.
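
The test-time idea described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Node` structure, the hand-built `TREE`, the majority-vote aggregation, and the `n_versions` and `drop_prob` parameters are all assumptions made for illustration, since the abstract does not specify how features are chosen for deletion or how the per-version decisions are combined.

```python
import random
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # A node is a leaf when `label` is set.
    label: Optional[str] = None
    # Ordered (feature_index, threshold, flip) tuples: primary split first,
    # then surrogate splits. `flip` inverts the left/right direction.
    splits: list = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Fallback direction when every split feature is missing (assumed rule).
    majority_left: bool = True


def predict(node, x):
    """Route x (None marks a missing feature) to a leaf via surrogate splits."""
    while node.label is None:
        go_left = node.majority_left
        for feat, thr, flip in node.splits:
            if x[feat] is not None:
                # Use the first split whose feature is available.
                go_left = (x[feat] <= thr) != flip
                break
        node = node.left if go_left else node.right
    return node.label


def surrogate_vote(tree, x, n_versions=25, drop_prob=0.3, seed=0):
    """Classify several randomly masked versions of x and take a majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_versions):
        # Each version deletes every feature independently with drop_prob.
        masked = [None if rng.random() < drop_prob else v for v in x]
        votes[predict(tree, masked)] += 1
    return votes.most_common(1)[0][0], votes


# Hypothetical one-split tree: primary split on feature 0, surrogate on feature 1.
TREE = Node(
    splits=[(0, 5.0, False), (1, 2.0, False)],
    left=Node(label="A"),
    right=Node(label="B"),
)

if __name__ == "__main__":
    label, votes = surrogate_vote(TREE, [3.0, 9.0])
    print(label, dict(votes))
```

In this toy tree the surrogate split disagrees with the primary split for the sample `[3.0, 9.0]`, so deleting feature 0 changes the decision. That disagreement is what lets a single tree cast different votes for different versions of the same test sample.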