Combining Synthetic Minority Oversampling Technique and Subset Feature Selection Technique For Class Imbalance Problem
Authors: Pawan Lachheta, S. Bawa
Published in: Proceedings of the International Conference on Advances in Information Communication Technology & Computing, 2016-08-12
DOI: 10.1145/2979779.2979804
Building an effective classification model for high-dimensional data that suffers from class imbalance is a major challenge. The problem is severe when negative samples make up a larger percentage of the data than positive samples. To address both class imbalance and high dimensionality in the dataset, we propose the SFS framework, which comprises SMOTE filters for balancing the datasets and a feature ranker for pre-processing the data. The framework is developed in R using various R packages. We evaluate the SFS framework and find that it outperforms other state-of-the-art methods.
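The two ingredients named in the abstract, SMOTE-style oversampling and feature ranking, can be illustrated with a minimal sketch. This is not the authors' R implementation. It is written in Python with only the standard library, and several things are assumptions made for illustration: the function names (`smote`, `rank_features`, `balance_and_select`), the Fisher-style ranking score (the abstract does not say which ranker is used), the choice to balance before ranking, and the toy data.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of base, excluding base itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def rank_features(X, y):
    """Return feature indices sorted from most to least discriminative.
    Score (an assumed, Fisher-style choice): squared difference of the
    class means divided by the sum of the class variances."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        mean_p, mean_n = sum(p) / len(p), sum(n) / len(n)
        var_p = sum((v - mean_p) ** 2 for v in p) / len(p)
        var_n = sum((v - mean_n) ** 2 for v in n) / len(n)
        scores.append((mean_p - mean_n) ** 2 / (var_p + var_n + 1e-12))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def balance_and_select(X, y, top=1, k=3, seed=0):
    """Oversample the positive (minority) class until both classes have the
    same size, then keep the `top` highest-ranked features."""
    minority = [row for row, label in zip(X, y) if label == 1]
    n_new = max((len(X) - len(minority)) - len(minority), 0)
    synth = smote(minority, n_new, k, seed)
    X_bal = X + synth
    y_bal = y + [1] * len(synth)
    keep = rank_features(X_bal, y_bal)[:top]
    return [[row[j] for j in keep] for row in X_bal], y_bal, keep


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [0.0, 2.0],
     [0.1, 3.0], [0.2, 0.0], [1.0, 3.0], [1.1, 1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal, keep = balance_and_select(X, y, top=1)
print(len(y_bal), sum(y_bal), keep)
```

On the toy data, the two positive samples are supplemented with four synthetic ones so that both classes have six samples, and the ranker retains the feature that separates the classes. A real pipeline would more likely rely on an established SMOTE implementation and a ranker chosen to suit the data.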