Indonesian Twitter text authority classification for government in Bandung

2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA). Publication date: 2014-08-01. DOI: 10.1109/ICAICTA.2014.7005928

Janice Laksana, A. Purwarianti


Citations: 12


Indonesian Twitter text authority classification for government in Bandung

Social media based complaint management systems have been deployed in several countries and cities, including Bandung. We propose automatic authority classification of Indonesian-language Twitter text as part of such a complaint management system. Our analysis showed that several types of Twitter messages appear in the official Twitter account of the city government. The classification is a statistical, multi-label text classification. We compared several aspects of the classification: the features, the algorithms, and the classification schemes. For features, we examined the complaint word feature, the n-gram feature, and the @username feature. For algorithms, we employed Decision Tree, Naïve Bayes, and Support Vector Machine, combined with the multi-label classification techniques Binary Relevance and Label Power Set. For complaint classification schemes, we compared direct classification with two-step classification. Using 2244 tweets from the Twitter account of the Bandung city government and 5-fold cross-validation, the best experimental result, 70.90% accuracy, was achieved by combining the 1-gram and complaint word features, with Support Vector Machine and Label Power Set as the algorithm, in the direct classification scheme.
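The abstract names three feature types and two multi-label techniques without showing how they differ in practice. The following is a minimal sketch, not the authors' implementation: the complaint-word lexicon, the authority labels, and the example tweets are hypothetical placeholders, since the abstract lists none of them. It shows how a tweet might be turned into 1-gram, complaint word, and @username features, and how Binary Relevance and Label Power Set reshape the same multi-label targets.

```python
from collections import Counter

# Hypothetical complaint-word lexicon (assumption; the paper's list is not in the abstract).
COMPLAINT_WORDS = {"rusak", "macet", "banjir"}


def extract_features(tweet):
    """Return 1-gram counts, @username counts, and a complaint-word flag."""
    tokens = [tok.strip(".,!?") for tok in tweet.lower().split()]
    features = Counter()
    for tok in tokens:
        if not tok:
            continue
        if tok.startswith("@"):
            features["mention=" + tok] += 1   # @username feature
        else:
            features["1gram=" + tok] += 1     # 1-gram feature
    # Complaint word feature: 1 if any lexicon word occurs, else 0.
    features["has_complaint_word"] = int(any(t in COMPLAINT_WORDS for t in tokens))
    return dict(features)


def binary_relevance_targets(label_sets, all_labels):
    """Binary Relevance: one independent yes/no target vector per label."""
    return {label: [int(label in ls) for ls in label_sets] for label in all_labels}


def label_powerset_targets(label_sets):
    """Label Power Set: each distinct label combination becomes one class."""
    return [tuple(sorted(ls)) for ls in label_sets]


# Hypothetical tweets and authority labels, for illustration only.
tweets = [
    "@example_city jalan rusak dan macet",
    "@example_city sampah menumpuk",
    "jalan macet, lampu rusak",
]
label_sets = [{"roads", "traffic"}, {"sanitation"}, {"roads", "traffic"}]

feature_rows = [extract_features(t) for t in tweets]
br_targets = binary_relevance_targets(label_sets, ["roads", "sanitation", "traffic"])
lp_targets = label_powerset_targets(label_sets)
```

Under Binary Relevance a separate binary classifier would be trained on each vector in `br_targets`, whereas under Label Power Set a single multi-class classifier would be trained on `lp_targets`, so combinations such as `("roads", "traffic")` are predicted as one unit. Either set of targets could then be passed to a Decision Tree, Naïve Bayes, or Support Vector Machine learner and evaluated with 5-fold cross-validation.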

