基于支持向量机和特征选择的税务评论文本挖掘大数据分析

2018 International Conference on Information and Communications Technology (ICOIACT) Pub Date : 2018-03-01 DOI:10.1109/ICOIACT.2018.8350743

Mihuandayani, Ema Utami, E. T. Luthfi

{"title":"基于支持向量机和特征选择的税务评论文本挖掘大数据分析","authors":"Mihuandayani, Ema Utami, E. T. Luthfi","doi":"10.1109/ICOIACT.2018.8350743","DOIUrl":null,"url":null,"abstract":"The tax gives an important role for the contributions of the economy and development of a country. The improvements to the taxation service system continuously done in order to increase the State Budget. One of consideration to know the performance of taxation particularly in Indonesia is to know the public opinion as for the object service. Text mining can be used to know public opinion about the tax system. The rapid growth of data in social media initiates this research to use the data source as big data analysis. The dataset used is derived from Facebook and Twitter as a source of data in processing tax comments. The results of opinions in the form of public sentiment in part of service, website system, and news can be used as consideration to improve the quality of tax services. In this research, text mining is done through the phases of text processing, feature selection and classification with Support Vector Machine (SVM). To reduce the problem of the number of attributes on the dataset in classifying text, Feature Selection used the Information Gain to select the relevant terms to the tax topic. Testing is used to measure the performance level of SVM with Feature Selection from two data sources. Performance measured using the parameters of precision, recall, and F-measure.","PeriodicalId":6660,"journal":{"name":"2018 International Conference on Information and Communications Technology (ICOIACT)","volume":"25 1","pages":"537-542"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Text mining based on tax comments as big data analysis using SVM and feature selection\",\"authors\":\"Mihuandayani, Ema Utami, E. T. Luthfi\",\"doi\":\"10.1109/ICOIACT.2018.8350743\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The tax gives an important role for the contributions of the economy and development of a country. The improvements to the taxation service system continuously done in order to increase the State Budget. One of consideration to know the performance of taxation particularly in Indonesia is to know the public opinion as for the object service. Text mining can be used to know public opinion about the tax system. The rapid growth of data in social media initiates this research to use the data source as big data analysis. The dataset used is derived from Facebook and Twitter as a source of data in processing tax comments. The results of opinions in the form of public sentiment in part of service, website system, and news can be used as consideration to improve the quality of tax services. In this research, text mining is done through the phases of text processing, feature selection and classification with Support Vector Machine (SVM). To reduce the problem of the number of attributes on the dataset in classifying text, Feature Selection used the Information Gain to select the relevant terms to the tax topic. Testing is used to measure the performance level of SVM with Feature Selection from two data sources. Performance measured using the parameters of precision, recall, and F-measure.\",\"PeriodicalId\":6660,\"journal\":{\"name\":\"2018 International Conference on Information and Communications Technology (ICOIACT)\",\"volume\":\"25 1\",\"pages\":\"537-542\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 International Conference on Information and Communications Technology (ICOIACT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICOIACT.2018.8350743\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on Information and Communications Technology (ICOIACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICOIACT.2018.8350743","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

税收对于一个国家的经济和发展的贡献起着重要作用。不断完善税务服务体系，增加国家预算。要了解税收的执行情况，特别是在印度尼西亚，需要考虑的一个因素是了解公众对客体服务的看法。文本挖掘可以用来了解公众对税收制度的看法。社交媒体中数据的快速增长促使本研究将数据来源作为大数据分析。所使用的数据集来自Facebook和Twitter，作为处理税务评论的数据来源。部分服务、网站系统、新闻等方面的舆情形式的意见结果，可以作为提高税务服务质量的考虑因素。在本研究中，文本挖掘通过文本处理、特征选择和支持向量机(SVM)分类三个阶段完成。为了减少文本分类中数据集属性数量的问题，Feature Selection使用信息增益来选择与税务主题相关的术语。采用测试的方法，通过两个数据源的特征选择来衡量支持向量机的性能水平。使用精度、召回率和F-measure参数测量性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Text mining based on tax comments as big data analysis using SVM and feature selection

The tax gives an important role for the contributions of the economy and development of a country. The improvements to the taxation service system continuously done in order to increase the State Budget. One of consideration to know the performance of taxation particularly in Indonesia is to know the public opinion as for the object service. Text mining can be used to know public opinion about the tax system. The rapid growth of data in social media initiates this research to use the data source as big data analysis. The dataset used is derived from Facebook and Twitter as a source of data in processing tax comments. The results of opinions in the form of public sentiment in part of service, website system, and news can be used as consideration to improve the quality of tax services. In this research, text mining is done through the phases of text processing, feature selection and classification with Support Vector Machine (SVM). To reduce the problem of the number of attributes on the dataset in classifying text, Feature Selection used the Information Gain to select the relevant terms to the tax topic. Testing is used to measure the performance level of SVM with Feature Selection from two data sources. Performance measured using the parameters of precision, recall, and F-measure.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2018 International Conference on Information and Communications Technology (ICOIACT)

自引率

0.00%

发文量