Towards Effective Unbiased Automated Feature Selection
K. Iswandy, A. König
2006 Sixth International Conference on Hybrid Intelligent Systems (HIS'06), published 2006-12-13
DOI: 10.1109/HIS.2006.72 (https://doi.org/10.1109/HIS.2006.72)
Citations: 7
Abstract
The selection of relevant and non-redundant features or variables from a larger set is a ubiquitous problem in many disciplines. Numerous automated methods have been introduced; however, the important issue of selection stability remains largely unaddressed. It can be observed that small changes in the data can lead to dramatic changes in the selection. This compromises statistical reliability, recognition rates, and knowledge extraction. In our work, we pursue an approach that employs data sampling techniques, e.g., the leave-one-out method, and generates selection statistics to determine a stability factor and identify stable features. In this paper, we introduce improved selection techniques derived from first- and second-order statistics and demonstrate their effectiveness on three benchmark problems of increasing complexity.
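
The idea of collecting selection statistics under leave-one-out resampling can be illustrated with a minimal sketch. The abstract does not specify the selection criterion or how the stability factor is computed, so everything below is an assumption made for illustration: a simple two-class Fisher-like score stands in for the selection technique, the stability factor is taken to be the fraction of leave-one-out runs in which a feature is selected, and all function names, the toy data, and the 0.9 threshold are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Assumed stand-in criterion: squared class-mean difference
    divided by the sum of the class variances, per feature."""
    classes = sorted(set(y))
    a = [row for row, label in zip(X, y) if label == classes[0]]
    b = [row for row, label in zip(X, y) if label == classes[1]]
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row in a]
        col_b = [row[j] for row in b]
        spread = pvariance(col_a) + pvariance(col_b)
        # Small constant avoids division by zero for constant features.
        scores.append((mean(col_a) - mean(col_b)) ** 2 / (spread + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def loo_selection_frequency(X, y, k):
    """Repeat the selection with each sample left out once and return,
    per feature, the fraction of runs in which it was selected."""
    n = len(X)
    counts = [0] * len(X[0])
    for i in range(n):
        X_loo = X[:i] + X[i + 1:]
        y_loo = y[:i] + y[i + 1:]
        for j in select_top_k(X_loo, y_loo, k):
            counts[j] += 1
    return [c / n for c in counts]


def stable_features(freq, threshold=0.9):
    """Features whose selection frequency reaches the (assumed) threshold."""
    return [j for j, f in enumerate(freq) if f >= threshold]


# Hypothetical toy data: feature 0 separates the classes, 1 and 2 do not.
X = [
    [0.0, 1.0, 5.0], [0.2, 3.0, 4.0], [0.1, 2.0, 6.0], [0.3, 4.0, 5.5],
    [1.0, 1.5, 5.2], [1.2, 3.5, 4.1], [1.1, 2.5, 6.1], [1.3, 4.5, 5.4],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

freq = loo_selection_frequency(X, y, k=1)
print(freq)
print(stable_features(freq))
```

A frequency close to 1 indicates a feature that is selected regardless of which sample is removed, while intermediate values flag features whose selection depends on individual samples. Other resampling schemes or selection criteria could be substituted without changing the overall structure.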