Authors: S. Devi, M. Sabrigiriraj
Published in: 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT), 2018-03-01
DOI: https://doi.org/10.1109/ICCTCT.2018.8550928
Feature Selection, Online Feature Selection Techniques for Big Data Classification: A Review
In recent years, many disciplines have had to handle huge datasets that contain a very large number of features. Feature Selection (FS) techniques aim to remove noisy, redundant, or unnecessary features that might degrade classification performance. Although many FS techniques exist, FS remains an active research field in the data mining, machine learning, and pattern recognition communities. As data dimensionality grows, many FS techniques face critical issues of efficiency and usefulness. In particular, conventional techniques lack the scalability needed to handle datasets consisting of millions of instances and to produce good results in a short amount of time. In such cases, an Online Feature Selection (OFS) algorithm can offer a better solution. This work reviews some of the available and well-known FS and OFS techniques and points out their pros and cons. It examines traditional FS and OFS techniques based on evolutionary computation, which are helpful for obtaining feature subsets from huge datasets. The review also provides a summary and analysis of machine learning algorithms for huge datasets. In addition, new machine learning strategies and methodologies are explained in terms of their capacity to deal with the different challenges, with the ultimate goal of assisting practitioners in selecting suitable solutions for their use cases. The review offers a view of the big data domain, identifies research gaps and opportunities, and provides a solid foundation and support for further research in machine learning on big datasets.
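To make the contrast between batch FS and OFS concrete, the following is a minimal sketch of one possible OFS scheme. It is not taken from the reviewed paper. It assumes a perceptron-style linear classifier that sees one labelled instance at a time and, after every update, keeps only the `budget` largest-magnitude weights (hard truncation). The function names, the synthetic data generator, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def truncate(weights, budget):
    """Keep the `budget` largest-magnitude weights and zero out the rest."""
    if budget >= len(weights):
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:budget])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def online_feature_selection(stream, n_features, budget, learning_rate=0.2):
    """Process (x, y) pairs one at a time, with y in {-1, +1}.

    Returns a weight vector with at most `budget` nonzero entries;
    the indices of the nonzero entries are the selected features.
    """
    weights = [0.0] * n_features
    for x, y in stream:
        margin = y * sum(w * xi for w, xi in zip(weights, x))
        if margin <= 0:  # update only on a mistake, as in the perceptron
            weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
            weights = truncate(weights, budget)
    return weights


def make_stream(n_samples, n_features, informative, seed=0):
    """Synthetic stream (assumption): the label depends only on `informative` indices."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        score = sum(x[i] for i in informative)
        yield x, (1 if score >= 0 else -1)


if __name__ == "__main__":
    stream = make_stream(n_samples=5000, n_features=50, informative=[3, 17])
    w = online_feature_selection(stream, n_features=50, budget=5)
    print("selected feature indices:", [i for i, wi in enumerate(w) if wi != 0.0])
```

Because each instance is processed once and then discarded, memory use stays proportional to the number of features rather than the number of instances, which is the scalability property the abstract attributes to OFS. Hard truncation is only one of several possible selection rules and is used here for brevity.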