Implementation of Machine Learning with the Text Mining Method on Twitter

Hamdun Sulaiman, Muhamad Ryansyah, Kudiantoro Widianto, Sidik Sidik, Andria Nugraha
Journal: Infotek : Jurnal Informatika dan Teknologi
Publication date: 2024-01-20
DOI: 10.29408/jit.v7i1.23734 (https://doi.org/10.29408/jit.v7i1.23734)
Citations: 0

Abstract

Original title: Implementasi Machine Learning Dengan Metode Text Mining Pada Twitter
PT. Telkom Indonesia (Indihome) currently uses social media as a customer-care channel for handling complaints. Tweets from Indihome customers on Twitter are handled by Indihome's customer service division, which manually categorizes every complaint tweet sent to the @indihome account, a process considered inefficient. This research aims to address the problem of categorizing complaint tweets and to develop tools that can extract the narrative of complaint tweets written in Indonesian. The research method is comparative. The Gataframework and RapidMiner tools are used to preprocess and clean the dataset, which supports corpus creation and sentiment analysis. After cleansing and preprocessing, the dataset totals 1,510 entries. Using the proposed method with the Support Vector Machine classification algorithm, the highest-scoring category reached 82.42% accuracy, 75.33% precision, and 98.75% recall, with an AUC of 0.826.
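The abstract reports accuracy, precision, recall, and AUC but does not say how they were computed. The sketch below shows one common way to derive these four metrics for a binary classifier, assuming label 1 marks the positive class and the classifier outputs a score per tweet. The function names, the 0.5 threshold, and the toy labels and scores are all hypothetical illustrations; they are not the study's data or its RapidMiner workflow.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true positives that the classifier found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is
    acceptable for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn), auc(y_true, scores))
```

A pattern like the one reported (recall well above precision) arises when false negatives are rare and false positives are comparatively common, that is, when the classifier misses few positive cases but labels some negatives as positive.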