Genetic algorithm and fuzzy C-means for feature selection: Based on a dual fitness function
Authors: Elmira Amiri Souri, Azadeh Mohebi, Abbas Ahmadi
Venue: 2017 Artificial Intelligence and Signal Processing Conference (AISP)
Published: 2017-10-01
DOI: 10.1109/AISP.2017.8515120 (https://doi.org/10.1109/AISP.2017.8515120)
Feature selection is an effective way to reduce computational complexity and information redundancy in high-dimensional data classification and clustering. Selecting the best features is much harder in unsupervised learning than in supervised learning, because there are no data labels to guide selection algorithms in removing irrelevant and redundant features. In this paper, we propose a new approach to unsupervised feature selection that uses a Genetic Algorithm as a heuristic search method and combines it with the Fuzzy C-Means algorithm. We propose a dual, multi-objective fitness function based on the Davies-Bouldin (DB) and Calinski-Harabasz (CH) indices. We show that these indices do not necessarily behave similarly. Thus, rather than simply taking their weighted average as the fitness function, we propose a new way to aggregate them based on their trade-offs. A comparison with popular feature selection algorithms across different datasets indicates that the proposed approach outperforms them.
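As a rough illustration of the two quantities the fitness function is built from, the sketch below scores a candidate feature subset (a binary mask, as a genetic algorithm chromosome might encode it) with the standard Davies-Bouldin and Calinski-Harabasz definitions. It is not the authors' implementation: the helper names, the toy data, and the use of fixed hard cluster labels (standing in for defuzzified Fuzzy C-Means memberships) are all assumptions, and the trade-off-based aggregation is not described in the abstract, so the sketch simply returns both values.

```python
import math

def select(X, mask):
    """Keep only the columns whose mask bit is 1 (hypothetical chromosome encoding)."""
    return [[v for v, m in zip(row, mask) if m] for row in X]

def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def group(X, labels):
    """Split rows into clusters according to their hard labels."""
    clusters = {}
    for row, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(row)
    return list(clusters.values())

def davies_bouldin(X, labels):
    """Mean over clusters of the worst (s_i + s_j) / d_ij ratio; lower is better."""
    clusters = group(X, labels)
    cents = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

def calinski_harabasz(X, labels):
    """Between/within dispersion ratio scaled by (n - k) / (k - 1); higher is better."""
    clusters = group(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(pts) * dist(centroid(pts), overall) ** 2 for pts in clusters)
    within = sum(dist(p, centroid(pts)) ** 2 for pts in clusters for p in pts)
    return (between / within) * (n - k) / (k - 1)

def dual_fitness(X, labels, mask):
    """Return (DB, CH) for a feature mask; assumes >= 1 selected feature and >= 2 clusters."""
    Xs = select(X, mask)
    return davies_bouldin(Xs, labels), calinski_harabasz(Xs, labels)

# Toy data (assumed): feature 0 separates the two groups, feature 1 is noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0],
     [5.0, 2.0], [5.2, 8.0], [5.1, 4.0]]
labels = [0, 0, 0, 1, 1, 1]

db_one, ch_one = dual_fitness(X, labels, [1, 0])    # informative feature only
db_both, ch_both = dual_fitness(X, labels, [1, 1])  # informative + noisy feature
print("mask [1, 0]:", db_one, ch_one)
print("mask [1, 1]:", db_both, ch_both)
```

On this toy data both indices favor dropping the noisy feature. In general DB is minimized while CH is maximized, and the two need not rank candidate subsets the same way, which is the motivation given above for aggregating them by their trade-offs rather than by a fixed weighted average.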