{"title":"Incremental text categorization based on hybrid optimization-based deep belief neural network","authors":"V. Srilakshmi, K. Anuradha, C. Bindu","doi":"10.3233/JHS-210659","DOIUrl":null,"url":null,"abstract":"One of the effective text categorization methods for learning the large-scale data and the accumulated data is incremental learning. The major challenge in the incremental learning is improving the accuracy as the text document consists of numerous terms. In this research, a incremental text categorization method is developed using the proposed Spider Grasshopper Crow Optimization Algorithm based Deep Belief Neural network (SGrC-based DBN) for providing optimal text categorization results. The proposed text categorization method has four processes, such as are pre-processing, feature extraction, feature selection, text categorization, and incremental learning. Initially, the database is pre-processed and fed into vector space model for the extraction of features. Once the features are extracted, the feature selection is carried out based on mutual information. Then, the text categorization is performed using the proposed SGrC-based DBN method, which is developed by the integration of the spider monkey optimization (SMO) with the Grasshopper Crow Optimization Algorithm (GCOA) algorithm. Finally, the incremental text categorization is performed based on the hybrid weight bounding model that includes the SGrC and Range degree and particularly, the optimal weights of the Range degree model is selected based on SGrC. The experimental result of the proposed text categorization method is performed by considering the data from the Reuter database and 20 Newsgroups database. The comparative analysis of the text categorization method is based on the performance metrics, such as precision, recall and accuracy. The proposed SGrC algorithm obtained a maximum accuracy of 0.9626, maximum precision of 0.9681 and maximum recall of 0.9600, respectively when compared with the existing incremental text categorization methods.","PeriodicalId":54809,"journal":{"name":"Journal of High Speed Networks","volume":"31 1","pages":"183-202"},"PeriodicalIF":0.7000,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of High Speed Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/JHS-210659","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 1
Abstract
One of the effective text categorization methods for learning the large-scale data and the accumulated data is incremental learning. The major challenge in the incremental learning is improving the accuracy as the text document consists of numerous terms. In this research, a incremental text categorization method is developed using the proposed Spider Grasshopper Crow Optimization Algorithm based Deep Belief Neural network (SGrC-based DBN) for providing optimal text categorization results. The proposed text categorization method has four processes, such as are pre-processing, feature extraction, feature selection, text categorization, and incremental learning. Initially, the database is pre-processed and fed into vector space model for the extraction of features. Once the features are extracted, the feature selection is carried out based on mutual information. Then, the text categorization is performed using the proposed SGrC-based DBN method, which is developed by the integration of the spider monkey optimization (SMO) with the Grasshopper Crow Optimization Algorithm (GCOA) algorithm. Finally, the incremental text categorization is performed based on the hybrid weight bounding model that includes the SGrC and Range degree and particularly, the optimal weights of the Range degree model is selected based on SGrC. The experimental result of the proposed text categorization method is performed by considering the data from the Reuter database and 20 Newsgroups database. The comparative analysis of the text categorization method is based on the performance metrics, such as precision, recall and accuracy. The proposed SGrC algorithm obtained a maximum accuracy of 0.9626, maximum precision of 0.9681 and maximum recall of 0.9600, respectively when compared with the existing incremental text categorization methods.
期刊介绍:
The Journal of High Speed Networks is an international archival journal, active since 1992, providing a publication vehicle for covering a large number of topics of interest in the high performance networking and communication area. Its audience includes researchers, managers as well as network designers and operators. The main goal will be to provide timely dissemination of information and scientific knowledge.
The journal will publish contributed papers on novel research, survey and position papers on topics of current interest, technical notes, and short communications to report progress on long-term projects. Submissions to the Journal will be refereed consistently with the review process of leading technical journals, based on originality, significance, quality, and clarity.
The journal will publish papers on a number of topics ranging from design to practical experiences with operational high performance/speed networks.