Cem Kaya, Zeynep Hilal Kilimci, Mitat Uysal, Murat Kaya
{"title":"基于优化特征选择的文本分类候鸟","authors":"Cem Kaya, Zeynep Hilal Kilimci, Mitat Uysal, Murat Kaya","doi":"10.7717/peerj-cs.2263","DOIUrl":null,"url":null,"abstract":"Text classification tasks, particularly those involving a large number of features, pose significant challenges in effective feature selection. This research introduces a novel methodology, MBO-NB, which integrates Migrating Birds Optimization (MBO) approach with naïve Bayes as an internal classifier to address these challenges. The motivation behind this study stems from the recognized limitations of existing techniques in efficiently handling extensive feature sets. Traditional approaches often fail to adequately streamline the feature selection process, resulting in suboptimal classification accuracy and increased computational overhead. In response to this need, our primary objective is to propose a scalable and effective solution that enhances both computational efficiency and classification accuracy in text classification systems. To achieve this objective, we preprocess raw data using the Information Gain algorithm, strategically reducing the feature count from an average of 62,221 to 2,089. Through extensive experiments, we demonstrate the superior effectiveness of MBO-NB in feature reduction compared to other existing techniques, resulting in significantly improved classification accuracy. Furthermore, the successful integration of naïve Bayes within MBO offers a comprehensive and well-rounded solution to the feature selection problem. In individual comparisons with Particle Swarm Optimization (PSO), MBO-NB consistently outperforms by an average of 6.9% across four setups. This research provides valuable insights into enhancing feature selection methods, thereby contributing to the advancement of text classification techniques. By offering a scalable and effective solution, MBO-NB addresses the pressing need for improved feature selection methods in text classification, thereby facilitating the development of more robust and efficient classification systems.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"124 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Migrating birds optimization-based feature selection for text classification\",\"authors\":\"Cem Kaya, Zeynep Hilal Kilimci, Mitat Uysal, Murat Kaya\",\"doi\":\"10.7717/peerj-cs.2263\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Text classification tasks, particularly those involving a large number of features, pose significant challenges in effective feature selection. This research introduces a novel methodology, MBO-NB, which integrates Migrating Birds Optimization (MBO) approach with naïve Bayes as an internal classifier to address these challenges. The motivation behind this study stems from the recognized limitations of existing techniques in efficiently handling extensive feature sets. Traditional approaches often fail to adequately streamline the feature selection process, resulting in suboptimal classification accuracy and increased computational overhead. In response to this need, our primary objective is to propose a scalable and effective solution that enhances both computational efficiency and classification accuracy in text classification systems. 
To achieve this objective, we preprocess raw data using the Information Gain algorithm, strategically reducing the feature count from an average of 62,221 to 2,089. Through extensive experiments, we demonstrate the superior effectiveness of MBO-NB in feature reduction compared to other existing techniques, resulting in significantly improved classification accuracy. Furthermore, the successful integration of naïve Bayes within MBO offers a comprehensive and well-rounded solution to the feature selection problem. In individual comparisons with Particle Swarm Optimization (PSO), MBO-NB consistently outperforms by an average of 6.9% across four setups. This research provides valuable insights into enhancing feature selection methods, thereby contributing to the advancement of text classification techniques. By offering a scalable and effective solution, MBO-NB addresses the pressing need for improved feature selection methods in text classification, thereby facilitating the development of more robust and efficient classification systems.\",\"PeriodicalId\":54224,\"journal\":{\"name\":\"PeerJ Computer Science\",\"volume\":\"124 1\",\"pages\":\"\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PeerJ Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.7717/peerj-cs.2263\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2263","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Migrating birds optimization-based feature selection for text classification
Text classification tasks, particularly those involving a large number of features, pose significant challenges for effective feature selection. This research introduces a novel methodology, MBO-NB, which integrates the Migrating Birds Optimization (MBO) approach with naïve Bayes as an internal classifier to address these challenges. The motivation behind this study stems from the recognized limitations of existing techniques in efficiently handling extensive feature sets: traditional approaches often fail to adequately streamline the feature selection process, resulting in suboptimal classification accuracy and increased computational overhead. Our primary objective is therefore to propose a scalable and effective solution that enhances both computational efficiency and classification accuracy in text classification systems. To achieve this, we preprocess the raw data with the Information Gain algorithm, strategically reducing the feature count from an average of 62,221 to 2,089. Through extensive experiments, we demonstrate that MBO-NB reduces features more effectively than other existing techniques, yielding significantly improved classification accuracy. Furthermore, the successful integration of naïve Bayes within MBO offers a comprehensive and well-rounded solution to the feature selection problem. In individual comparisons with Particle Swarm Optimization (PSO), MBO-NB consistently outperforms PSO by an average of 6.9% across four experimental setups. This research provides valuable insights into enhancing feature selection methods and thereby contributes to the advancement of text classification techniques. By offering a scalable and effective solution, MBO-NB addresses the pressing need for improved feature selection in text classification, facilitating the development of more robust and efficient classification systems.
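The abstract outlines a two-stage pipeline: an Information Gain filter that trims the vocabulary to roughly 2,000 features, followed by an MBO wrapper that searches binary feature masks and scores each candidate with a naïve Bayes classifier. The paper's implementation details are not reproduced here; the Python sketch below is a minimal, hypothetical reading of that pipeline. The flock size, the bit-flip rate, the neighbour-sharing scheme, and the use of scikit-learn's mutual_info_classif as a stand-in for Information Gain are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB


def information_gain_filter(X, y, k=2089):
    """Stage 1: keep the k highest-scoring features.

    mutual_info_classif is used here as a stand-in for the paper's
    Information Gain scoring (an assumption, not the authors' code)."""
    selector = SelectKBest(mutual_info_classif, k=k)
    return selector.fit_transform(X, y), selector


def nb_fitness(mask, X, y):
    """Fitness of a binary feature mask: cross-validated naive Bayes accuracy."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(MultinomialNB(), X[:, mask.astype(bool)], y, cv=3).mean()


def flip_neighbours(mask, n_neighbours, rng):
    """Neighbour solutions obtained by flipping a small random subset of bits."""
    out = []
    n_flips = max(1, mask.size // 50)  # illustrative perturbation rate
    for _ in range(n_neighbours):
        m = mask.copy()
        m[rng.choice(mask.size, size=n_flips, replace=False)] ^= 1
        out.append(m)
    return out


def mbo_nb_select(X, y, n_birds=13, n_neighbours=3, n_iters=20, seed=0):
    """Stage 2: simplified migrating-birds search over feature masks.

    Each bird keeps a mask; the best unused neighbour of each bird is
    handed down the V-formation, and the leader rotates to the back
    after every tour (a rough sketch of MBO's leapfrogging step)."""
    rng = np.random.default_rng(seed)
    flock = [rng.integers(0, 2, X.shape[1]) for _ in range(n_birds)]
    fitness = [nb_fitness(b, X, y) for b in flock]

    for _ in range(n_iters):
        shared = []  # neighbours handed down from the bird ahead
        for i in range(n_birds):
            candidates = flip_neighbours(flock[i], n_neighbours, rng) + shared
            scores = [nb_fitness(c, X, y) for c in candidates]
            order = np.argsort(scores)[::-1]
            if scores[order[0]] > fitness[i]:
                flock[i], fitness[i] = candidates[order[0]], scores[order[0]]
            # hand the second-best candidate to the next bird in the formation
            shared = [candidates[order[1]]] if len(order) > 1 else []
        # rotate the leader to the tail of the formation
        flock.append(flock.pop(0))
        fitness.append(fitness.pop(0))

    best = int(np.argmax(fitness))
    return flock[best].astype(bool), fitness[best]


# Usage (hypothetical): X_counts is a dense, non-negative document-term matrix.
# X_reduced, sel = information_gain_filter(X_counts, y)
# mask, acc = mbo_nb_select(X_reduced, y)
```

The sketch assumes a dense, non-negative document-term matrix (e.g., bag-of-words counts), which is what MultinomialNB expects; the actual paper's preprocessing, flock parameters, and comparison setups may differ.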
About the journal:
PeerJ Computer Science is an open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.