Indonesian Twitter text authority classification for government in Bandung

Janice Laksana, A. Purwarianti
Published in: 2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA)
Publication date: 2014-08-01
DOI: 10.1109/ICAICTA.2014.7005928 (https://doi.org/10.1109/ICAICTA.2014.7005928)
Citations: 12

Abstract

Social-media-based complaint management systems have been deployed in several countries and cities, including Bandung. We propose automatic authority classification for Indonesian-language Twitter text as part of such a complaint management system. Our analysis showed that several types of Twitter messages appear on the city government's official Twitter account. The classification uses statistical multi-label text classification. We compared three aspects of the classification: the features, the algorithms, and the classification schemes. For features, we examined the complaint-word feature, n-gram features, and the @username feature. For algorithms, we used Decision Tree, Naïve Bayes, and Support Vector Machine, each with the multi-label techniques Binary Relevance and Label Power Set. For classification schemes, we compared direct classification with two-step classification. Using 2244 tweets from the Bandung city government's Twitter account and 5-fold cross-validation, the best result, 70.90% accuracy, was achieved by combining the 1-gram and complaint-word features with Support Vector Machine and Label Power Set in the direct classification scheme.
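The abstract names three feature types and two multi-label transformations without showing how they look in practice. The following is a minimal sketch, not the authors' implementation. The complaint lexicon, the example handle, the authority labels, and all function names are hypothetical, since the abstract does not give them. It shows how a tweet could be turned into 1-gram, complaint-word, and @username features, and how Binary Relevance and Label Power Set each reshape multi-label targets before a single-label learner such as an SVM is trained.

```python
import re
from collections import Counter

# Hypothetical complaint lexicon (assumption): the paper's word list is not in the abstract.
COMPLAINT_WORDS = {"rusak", "macet", "banjir"}


def extract_features(tweet):
    """Build a sparse feature dict: 1-gram counts, a complaint-word flag, @username mentions."""
    text = tweet.lower()
    mentions = re.findall(r"@\w+", text)
    # Remove mentions before tokenising so they are not also counted as 1-grams.
    tokens = re.findall(r"[a-z]+", re.sub(r"@\w+", " ", text))
    features = Counter(f"w={tok}" for tok in tokens)
    features["has_complaint_word"] = int(any(tok in COMPLAINT_WORDS for tok in tokens))
    for mention in mentions:
        features[f"mention={mention}"] = 1
    return dict(features)


def binary_relevance(label_sets):
    """Binary Relevance: one independent 0/1 target vector per label."""
    labels = sorted(set().union(*label_sets))
    return {label: [int(label in s) for s in label_sets] for label in labels}


def label_powerset(label_sets):
    """Label Power Set: each distinct label combination becomes one class id."""
    class_ids = {}
    targets = []
    for s in label_sets:
        key = tuple(sorted(s))
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids
```

With Binary Relevance, one binary classifier is trained per authority, so correlations between labels are ignored. With Label Power Set, a single multi-class classifier is trained over the label combinations seen in the training data, which preserves co-occurrence but cannot predict an unseen combination. Either set of targets can then be paired with the feature dicts and evaluated with 5-fold cross-validation, as the abstract describes.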