Text mining based on tax comments as big data analysis using SVM and feature selection

Mihuandayani, Ema Utami, E. T. Luthfi
Published in: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 537-542, 2018-03-01
DOI: 10.1109/ICOIACT.2018.8350743 (https://doi.org/10.1109/ICOIACT.2018.8350743)
Citations: 8

Abstract

Taxes play an important role in a country's economy and development. The taxation service system is continuously improved in order to increase the State Budget. One way to assess the performance of taxation, particularly in Indonesia, is to learn the opinion of the public as the recipients of the service. Text mining can be used to discover public opinion about the tax system. The rapid growth of social media data motivates this research to use it as a source for big data analysis. The dataset of tax comments is drawn from Facebook and Twitter. The resulting public sentiment on service, the website system, and news can inform improvements to the quality of tax services. In this research, text mining proceeds through the phases of text processing, feature selection, and classification with a Support Vector Machine (SVM). To reduce the number of attributes in the dataset when classifying text, feature selection uses Information Gain to select the terms relevant to the tax topic. Testing measures the performance of the SVM with feature selection on the two data sources. Performance is measured using precision, recall, and F-measure.
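
The abstract names Information Gain for term selection and precision, recall, and F-measure for evaluation, but gives no formulas or code. The following is a minimal sketch of those two steps under stated assumptions: each comment is reduced to a set of tokens, a term is treated as a binary present/absent feature, and Information Gain is the usual entropy reduction. The function names, the toy comments, and the labels are hypothetical illustrations, not the authors' data or implementation, and the SVM classification step is left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting the documents on presence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_terms(docs, labels, k):
    """Return the k vocabulary terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy comments, already tokenized (illustration only).
docs = [
    {"tax", "service", "slow"},
    {"tax", "website", "error"},
    {"tax", "service", "helpful"},
    {"tax", "website", "easy"},
]
labels = ["neg", "neg", "pos", "pos"]

top_terms = select_terms(docs, labels, 4)
scores = precision_recall_f1(["pos", "pos", "neg", "neg"],
                             ["pos", "neg", "pos", "neg"],
                             positive="pos")
```

In this toy set the word "tax" occurs in every comment, so it carries no information about sentiment and is ranked last. This is the sense in which feature selection shrinks the attribute set before classification.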