Classification and Optimization Scheme for Text Data using Machine Learning Naïve Bayes Classifier

Venkatesh, K. Ranjitha
Published in: 2018 IEEE World Symposium on Communication Engineering (WSCE)
Publication date: 2018-12-01
DOI: 10.1109/WSCE.2018.8690536 (https://doi.org/10.1109/WSCE.2018.8690536)
Citations: 20

Abstract

Text classification is an essential step in natural language processing, and it can be performed with a variety of classification algorithms. Hadoop MapReduce is widely used to classify very large volumes of text data. However, MapReduce takes a long time to complete its tasks, which increases latency, and because the data is distributed across the cluster, processing time grows and processing speed drops. Hadoop also requires lengthy code. Motivated by this, we propose a simple yet effective machine learning method that uses a Naïve Bayes classifier for text data. In the machine learning approach, the classifier is built automatically by learning the properties of the categories from a set of predefined training data. Hence, it can process complex and varied data in dynamic settings. The proposed Naïve Bayes classifier scales linearly with the number of predictors and data points and can be used for both binary and multiclass classification problems. We implemented the presented schemes using a machine learning tool. The experimental results demonstrate improved classification performance.
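
The abstract does not give the classifier's details, so the following is only a minimal sketch of how a Naïve Bayes text classifier of this general kind can be built from labeled training data. It assumes the multinomial variant with add-one (Laplace) smoothing and a whitespace tokenizer; the class name NaiveBayesText, the toy documents, and the labels are hypothetical and are not taken from the paper. Training is a single pass over the tokens, and prediction touches each class once per token, which illustrates the linear scaling mentioned above. The same code handles binary and multiclass problems, since the number of classes is taken from the labels.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)        # number of documents per class
        self.word_counts = defaultdict(Counter)  # class -> token frequencies
        self.vocab = set()
        # One pass over the training data: cost is linear in the token count.
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()         # assumed whitespace tokenizer
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus with three classes (multiclass case).
train_docs = [
    "cheap pills buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "project status meeting notes",
    "football match score tonight",
    "team wins the match",
]
train_labels = ["spam", "spam", "work", "work", "sport", "sport"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict("buy cheap pills"))
print(model.predict("monday meeting notes"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a token that is absent from one class from driving that class's probability to zero.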