Automating Public Complaint Classification Through JakLapor Channel: A Case Study of Jakarta, Indonesia

2022 IEEE International Smart Cities Conference (ISC2). Pub Date: 2022-09-26. DOI: 10.1109/ISC255366.2022.9922346

Sheila Maulida Intani, B. I. Nasution, M. E. Aminanto, Y. Nugraha, Nurhaya Muchtar, J. Kanggrawan



Abstract

The DKI Jakarta provincial government is ready to support the digital transformation program with a series of digitally integrated policies. Residents of DKI Jakarta can now easily submit complaints about problems in their surroundings through the JakLapor feature of the JAKI application. However, incoming reports are still classified manually. As a result, many complaints are still filed under a category that does not suit them. This research compares multiple machine learning models for automatic text classification in order to find the best complaint classification model. We also vary the feature extraction method to see which combination performs best. The machine learning models are Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost). The feature extraction methods are Count Vectorizer, Term Frequency-Inverse Document Frequency (TF-IDF), N-Gram, and Latent Semantic Analysis (LSA). The classification results show that Random Forest with TF-IDF feature extraction is the most accurate of the combinations tested, with an accuracy of 90%.
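As a rough illustration of the TF-IDF feature extraction named in the abstract, the sketch below computes textbook TF-IDF weights by hand. It is not the authors' pipeline: the `tfidf` helper, the whitespace tokenization, the unsmoothed `log(N / df)` formula, and the three sample complaint texts are all assumptions made for this example. Library implementations usually add smoothing and normalization, so their numbers would differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency (share of the document) times inverse
            # document frequency (rarity across the collection).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Invented complaint texts, for illustration only (not from the paper's data).
complaints = [
    "jalan rusak berlubang",
    "sampah menumpuk di jalan",
    "lampu jalan mati",
]
weights = tfidf(complaints)
```

In this toy collection the word "jalan" occurs in every document, so its weight is zero, while a word such as "rusak" that occurs in only one document receives a positive weight. Vectors like these would then be passed to a classifier such as Random Forest.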


