Sombut Foithong, P. Srinil, Kattiya T. Yangyuen, Thanawat Phattaraworamet
Rough-mutual feature selection based-on minimal-boundary and maximal-lower
2016 Management and Innovation Technology International Conference (MITicon), published 2016-10-01
DOI: https://doi.org/10.1109/MITICON.2016.8025230
Abstract
Feature selection (FS) is an important preprocessing step for many data mining applications. Most existing FS methods based on rough set theory rely on the dependency function, which uses the lower approximation to measure the goodness of a feature subset. However, a measure that considers only information from the positive region and neglects the boundary region can overlook much relevant information. This paper proposes the minimal boundary region and maximal lower approximation (mBML) criterion for feature selection based on rough sets and mutual information (MI), which uses the difference between the information in the lower approximation and the information contained in the boundary region. Using this criterion can yield higher predictive accuracy than a measure based on the positive region alone. Experimental results are reported for crisp and real-valued data and compared with other FS methods in terms of subset size, runtime, and classification accuracy.
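
To make the rough-set terms in the abstract concrete, the following is a minimal Python sketch of the standard textbook definitions of the positive region, the boundary region, and the dependency function for crisp data. It is not the authors' mBML criterion, whose exact formula is not given in the abstract. The toy decision table, the attribute names `a`, `b`, `d`, and the helper names `partition`, `regions`, and `dependency` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def regions(rows, cond_attrs, dec_attr):
    """Return (positive_region, boundary_region) as sets of row indices.

    A class whose rows all share one decision value lies in the lower
    approximation of that decision class, hence in the positive region.
    A class with mixed decision values lies in the boundary region.
    """
    positive, boundary = set(), set()
    for block in partition(rows, cond_attrs):
        decisions = {rows[i][dec_attr] for i in block}
        if len(decisions) == 1:
            positive |= block
        else:
            boundary |= block
    return positive, boundary


def dependency(rows, cond_attrs, dec_attr):
    """Classical dependency degree: |positive region| / |universe|."""
    positive, _ = regions(rows, cond_attrs, dec_attr)
    return len(positive) / len(rows)


# Hypothetical toy decision table: conditions a, b and decision d.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
]

pos, bnd = regions(rows, ["a", "b"], "d")
print(sorted(pos))  # [0, 1, 4, 5]
print(sorted(bnd))  # [2, 3]
print(round(dependency(rows, ["a", "b"], "d"), 3))  # 0.667
```

In this toy table, rows 2 and 3 agree on both condition attributes but disagree on the decision, so they fall in the boundary region. The dependency degree counts only the four consistent rows and ignores those two, which is the kind of information a positive-region-only measure discards.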