Fusion Approach for Document Classification using Random Forest and SVM

Sheelesh Kumar Sharma, Navel Sharma, Prem Prakash Potter
Published in: 2020 9th International Conference System Modeling and Advancement in Research Trends (SMART), 2020-12-04
DOI: 10.1109/SMART50582.2020.9337131
Citations: 1

Abstract

Document classification is an important task with many potential applications. As the number of digital documents keeps growing, efficient and accurate classification methods have become imperative. When the document categories are known in advance, the problem can be solved with a supervised learning approach: the system learns the traits of each category from labeled data and is then used as a predictor for unseen data. Here, we discuss a fusion supervised learning approach for document classification. Random Forests and the Support Vector Machine are combined into a hybrid approach for document categorization. The proposed approach assigns documents to their respective categories and works well on various benchmark datasets such as 20 Newsgroups, CMU, and Classic.
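The abstract does not say how the Random Forest and SVM outputs are combined. As a purely illustrative sketch, one common fusion rule is soft voting: average the per-class probabilities from the two models and pick the highest-scoring label. The function name `fuse_soft_vote`, the `rf_weight` parameter, and the probability values below are all assumptions made for this example, not the authors' method or results.

```python
from typing import Dict


def fuse_soft_vote(
    rf_probs: Dict[str, float],
    svm_probs: Dict[str, float],
    rf_weight: float = 0.5,
) -> str:
    """Return the label with the highest weighted-average probability.

    rf_probs / svm_probs map each class label to the probability that a
    (hypothetical) random forest or SVM assigned to one document.
    rf_weight is the share given to the random forest; the SVM gets the rest.
    """
    if not 0.0 <= rf_weight <= 1.0:
        raise ValueError("rf_weight must be in [0, 1]")

    # Sorted union of labels; a label missing from one model counts as 0.0.
    labels = sorted(set(rf_probs) | set(svm_probs))
    fused = {
        label: rf_weight * rf_probs.get(label, 0.0)
        + (1.0 - rf_weight) * svm_probs.get(label, 0.0)
        for label in labels
    }
    # max() keeps the first maximum, so ties go to the alphabetically first label.
    return max(labels, key=lambda label: fused[label])


# Made-up probabilities for one document (illustrative values only).
rf = {"sci.space": 0.55, "rec.autos": 0.45}
svm = {"sci.space": 0.30, "rec.autos": 0.70}
prediction = fuse_soft_vote(rf, svm)
```

With equal weights the more confident SVM decides the label here; raising `rf_weight` shifts the decision toward the random forest. Other fusion rules (stacking, or feeding one model's output into the other) are equally plausible readings of "hybrid approach".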