{"title":"Advanced local feature selection in medical diagnostics","authors":"S. Puuronen, A. Tsymbal, Iryna Skrypnyk","doi":"10.1109/CBMS.2000.856868","DOIUrl":null,"url":null,"abstract":"Current electronic data repositories contain enormous amounts of data, especially in medical domains, where data is often feature-space heterogeneous, so that different features have different importance in different sub-areas of the whole space. In this paper, we suggest a technique that searches for a strategic splitting of the feature space, identifying the best subsets of features for each instance. Our technique is based on the wrapper approach, where a classification algorithm is used as the evaluation function to differentiate between several feature subsets. We apply a recently developed technique for the dynamic integration of classifiers and use decision trees. For each test instance, we consider only those feature combinations that include features that are present in the path taken by the test instance in the decision tree. We evaluate our technique on medical data sets from the UCI machine learning repository. The experiments show that local feature selection is often advantageous in comparison with feature selection on the whole space.","PeriodicalId":189930,"journal":{"name":"Proceedings 13th IEEE Symposium on Computer-Based Medical Systems. CBMS 2000","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 13th IEEE Symposium on Computer-Based Medical Systems. CBMS 2000","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CBMS.2000.856868","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19
Abstract
Current electronic data repositories contain enormous amounts of data, especially in medical domains, where data is often feature-space heterogeneous: different features have different importance in different sub-areas of the whole space. In this paper, we propose a technique that searches for a strategic splitting of the feature space, identifying the best subset of features for each instance. Our technique is based on the wrapper approach, in which a classification algorithm is used as the evaluation function to compare candidate feature subsets. We apply a recently developed technique for the dynamic integration of classifiers, using decision trees. For each test instance, we consider only those feature combinations that include the features tested along the path the instance takes through the decision tree. We evaluate our technique on medical data sets from the UCI machine learning repository. The experiments show that local feature selection is often advantageous compared with feature selection over the whole space.
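The per-instance procedure described above can be illustrated with a short sketch. This is not the authors' implementation: it substitutes a plain wrapper search with a 3-nearest-neighbour evaluator (cross-validated accuracy) for the dynamic integration of classifiers, and it uses scikit-learn's DecisionTreeClassifier on the UCI breast-cancer data as a stand-in for the medical data sets used in the paper. The step it does reproduce is restricting, for each test instance, the candidate feature subsets to those containing the features tested along that instance's path through the tree.

```python
# Minimal sketch of path-guided local feature selection (assumptions noted above).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def path_features(tree, x):
    """Indices of the features tested along the path x takes through the tree."""
    node_ids = tree.decision_path(x.reshape(1, -1)).indices
    feats = tree.tree_.feature[node_ids]
    return sorted({int(f) for f in feats if f >= 0})  # negative values mark leaf nodes


def local_subset(X_train, y_train, base_feats, n_features, evaluator):
    """Wrapper search: always keep the path features, try adding each remaining feature."""
    def score(feats):
        return cross_val_score(evaluator, X_train[:, feats], y_train, cv=3).mean()

    best_feats, best_score = list(base_feats), score(base_feats)
    for extra in range(n_features):
        if extra in base_feats:
            continue
        cand = sorted(base_feats + [extra])
        s = score(cand)
        if s > best_score:
            best_feats, best_score = cand, s
    return best_feats


X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
knn = KNeighborsClassifier(n_neighbors=3)

# Classify a handful of test instances, each with its own locally selected subset.
correct, n_eval = 0, 25
for x, label in zip(X_te[:n_eval], y_te[:n_eval]):
    feats = path_features(tree, x)
    feats = local_subset(X_tr, y_tr, feats, X.shape[1], knn)
    pred = knn.fit(X_tr[:, feats], y_tr).predict(x[feats].reshape(1, -1))[0]
    correct += int(pred == label)

print(f"local-feature-selection accuracy on {n_eval} instances: {correct / n_eval:.3f}")
```

The greedy one-feature-at-a-time extension and the 3-NN evaluator are illustrative choices only; any wrapper search strategy and base classifier could be plugged in, with the tree-path restriction providing the "local" part of the selection.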